Reference To A Related Application



The present application is a continuation-in-part of our copending U.S. Patent Application Serial No. 
07/866,045, filed on April 9, 1992, which is incorporated by reference in its entirety. 


Background of the Invention 

The present invention concerns the non-A, non-B hepatitis (hereinafter called NANB hepatitis) virus
genome, polynucleotides, polypeptides, related antigens, antibodies and detection systems for detecting
NANB antigens or antibodies.

The forms of viral hepatitis for which the DNA or RNA of the causative virus has been elucidated, and
for which diagnosis and, in some cases, prevention have been established, are hepatitis A and hepatitis B.
The general name NANB hepatitis was given to the other forms of viral hepatitis.

Post-transfusion hepatitis was remarkably reduced after the introduction of diagnostic systems for screening
transfusion blood for hepatitis B. However, there are still an estimated 280,000 annual cases of
post-transfusion hepatitis caused by NANB hepatitis in Japan.

NANB hepatitis viruses were recently named C, D and E according to their types, and scientists have started
a worldwide effort to identify the causative viruses and, subsequently, to eradicate them.

In 1988, Chiron Corp. claimed that they had succeeded in cloning the genome of an RNA virus, which they
termed hepatitis C virus (hereinafter called HCV), as the causative agent of NANB hepatitis, and reported
its nucleotide sequence (British Patent 2,212,511, which is the equivalent of European Patent Application
0,318,216). HCV (C100-3) antibody detection systems based on the sequence are now being introduced for
screening of transfusion blood and for diagnosis of patients in Japan and in many other countries. The
detection systems for the C100-3 antibody have proven a partial association with NANB hepatitis;
however, they capture only about 70% of carriers and chronic hepatitis patients, and they fail to detect the
antibody in acute phase infection, thus leaving problems yet to be solved even after development of the
C100-3 antibody assay by Chiron Corp.

The course of NANB hepatitis is troublesome: most patients are considered to become carriers and
then to develop chronic hepatitis. In addition, most patients with chronic hepatitis develop liver cirrhosis
and then hepatocellular carcinoma. It is therefore imperative to isolate the virus itself and to develop
effective diagnostic reagents enabling earlier diagnosis.

The existence of a number of NANB hepatitis cases which cannot be diagnosed by Chiron's C100-3 antibody
detection kits suggests a possible difference in subtype between Chiron's HCV and the Japanese NANB
hepatitis virus.

In order to develop NANB hepatitis diagnostic kits of greater specificity and to develop effective vaccines,
it is essential to analyze each subtype of the NANB hepatitis causative virus at the
nucleotide level and at the corresponding amino acid level.

Summary of the Invention



An object of the present invention is to provide the nucleotide sequence coding for the structural protein
of NANB hepatitis virus and, with such information, to analyze the amino acids of the protein in order to locate and
provide polypeptides useful as antigens for the establishment of detection systems for NANB virus, its related
antigens and antibodies.

A further object of the present invention is to locate polynucleotides essential to treatment, prevention
and diagnosis, and polypeptides effective as antigens, by isolating NANB hepatitis virus RNA from human
and chimpanzee virus carriers, cloning the cDNA covering the whole structural gene of the virus to
determine its nucleotide sequence, and studying the amino acid sequence encoded by the cDNA. As a result, the
inventors have determined the nucleotide sequences of the whole genome of a strain of NANB virus called HC-J6 and
of a strain called HC-J8. The NANB hepatitis virus genomes of HC-J6 and HC-J8 differ from that of Chiron's HCV.

Brief Description of the Drawings 



Figure 1 shows the restriction map and structure of the coding region of the NANB hepatitis virus genome
(HC-J6) and the positions of clones. C, E, NS-1, NS-2, NS-3, NS-4 and NS-5 are the abbreviations of core,
envelope, non-structural-1, -2, -3, -4 and -5.

Figures 2 to 4 show the method of determination of the nucleotide sequence of the 5' terminus of the NANB
hepatitis virus genome of strains HC-J1, HC-J4 and HC-J6, respectively.

Figure 5 shows the method of determination of the nucleotide sequence of the 3' terminus of the HC-J6
genome. Solid lines show nucleotide sequences determined by clones from libraries of bacteriophage
lambda gt10, and broken lines show nucleotide sequences determined by clones obtained by PCR.

Figure 6 shows the structure of the coding region of the NANB hepatitis virus genome (HC-J8) and the positions of
clones. Regions a to n indicate positions of amplification by PCR.

Detailed Description of the Invention 


The present invention provides NANB hepatitis virus genome RNA for strain HC-J6 (sequence list 1)
consisting of a noncoding region of 340 nucleotides on the 5' terminus, followed by an open reading frame
of 9099 nucleotides coding for the structural protein and non-structural protein, followed by a noncoding
region of 150 nucleotides containing a U-stretch of 108 uracils on the 3' terminus of NANB
hepatitis virus, and NANB hepatitis virus genome having substantially the nucleotide sequence of sequence
list 1.

The present invention provides polynucleotide N-9589 (strain HC-J6) comprising the DNA nucleotide
sequence of sequence list 2; cDNA clone J6-081 comprising the nucleotide sequence of sequence list 3;
cDNA clone J6-08 comprising the nucleotide sequence of sequence list 4; and NANB hepatitis virus
polynucleotides having substantially the sequence of nucleotides of NANB hepatitis virus nucleotides shown
in sequence lists 2 through 4.

The invention provides polypeptide coded for by genome or polynucleotide of HC-J6 above, polypeptide
P-J6-3033 comprising the polypeptide sequence of sequence list 5, polypeptides produced by using
recombinant genome, recombinant polynucleotides and recombinant cDNA of whole or a part of cDNA
above, and polyclonal or monoclonal antibodies against the polypeptides described above.

The present invention also provides NANB hepatitis virus genome for strain HC-J8 comprising
sequence list 6, NANB hepatitis virus RNA consisting of a noncoding region of 341 nucleotides on the
5' terminus followed by an open reading frame of 9099 nucleotides coding for the structural
protein and non-structural protein followed by a noncoding region of 71 nucleotides containing a
U-stretch of 30 uracils on the 3' terminus of NANB hepatitis virus comprising sequence list 6, and
NANB hepatitis virus genome having substantially the nucleotide sequence of sequence list 6.

The present invention provides polynucleotide N-9511 for strain HC-J8 comprising the DNA nucleotide
sequence of sequence list 7 and NANB hepatitis virus polynucleotide having substantially the sequence of
nucleotides of NANB hepatitis virus nucleotides comprising sequence list 7.

The invention provides polypeptide coded for by genome or polynucleotide of HC-J8 above, polypeptide
P-J8-3033 comprising the polypeptide sequence of sequence list 8 and polypeptide P-J8-3033-2
comprising the polypeptide sequence of sequence list 9, polypeptides produced by using recombinant
genome, recombinant polynucleotides and recombinant cDNA of whole or a part of cDNA above, and
polyclonal or monoclonal antibodies against the polypeptides described above.

The present invention, furthermore, provides NANB hepatitis diagnostic systems using the polypeptides or
antibodies described above.

In the method described below, NANB hepatitis virus RNA of the present invention was obtained and its 
nucleotide sequence was determined. 

Plasma samples (HC-J1, HC-J4, HC-J6 and HC-J8) were obtained from humans and a chimpanzee. HC-J1,
HC-J6 and HC-J8 were obtained from Japanese blood donors who had tested positive for HCV
antibody. HC-J4 was obtained from the chimpanzee subjected to the challenge test but was negative for
Chiron's C100-3 antibody previously mentioned.

RNA was isolated from each of the plasma samples. Following the study of the 5' terminus of approximately
2,500 nucleotides and the 3' terminus of approximately 1,100 nucleotides disclosed in Japanese patent
application No. 196175/91, the inventors have completed the study of the region coding for non-structural
protein of strain HC-J6 and the study of the full length sequence of 9,589 nucleotides of HC-J6 genome
RNA, and have completed the study of the region coding for non-structural protein of strain HC-J8 and the
study of the full length sequence of 9,511 nucleotides of HC-J8 genome RNA.

As described in the Example below, strain HC-J6 had a 5' noncoding region consisting of 340
nucleotides, and strain HC-J8 had a 5' noncoding region consisting of 341 nucleotides, followed by the region
coding for structural protein and the region coding for non-structural protein.

Concerning the 3' terminus, strain HC-J6 was found to have a region consisting of 150 nucleotides
containing a U-stretch consisting of 108 uracils following the region coding for non-structural protein,
and strain HC-J8 was found to have a region consisting of 71 nucleotides containing a U-stretch consisting
of 30 uracils following the region coding for non-structural protein.

The coding region starting with adenine (341st nucleotide from the 5' terminus for strain HC-J6 and
342nd nucleotide from the 5' terminus for strain HC-J8) was found to have a long open reading frame
consisting of 9099 nucleotides which codes for 3033 amino acids. HCV, or hepatitis C virus, is thought to
be closely related to flaviviruses in regard to its genetic structure. The coding region of the NANB hepatitis virus
genome of the present invention was considered to consist of regions named C (core), E (envelope),
NS-1 (non-structural-1), NS-2 (non-structural-2), NS-3 (non-structural-3), NS-4 (non-structural-4) and NS-5
(non-structural-5).
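
The segment lengths reported in the text can be cross-checked with simple arithmetic. The following is a minimal sketch; the dictionary layout and the function name `summarize` are illustrative and are not part of the original disclosure.

```python
# Segment lengths (in nucleotides) as stated in the text for each strain.
GENOMES = {
    "HC-J6": {"five_prime_nc": 340, "orf": 9099, "three_prime_nc": 150},
    "HC-J8": {"five_prime_nc": 341, "orf": 9099, "three_prime_nc": 71},
}


def summarize(segments):
    """Return total genome length, 1-based ORF start and encoded residue count."""
    total = (segments["five_prime_nc"] + segments["orf"]
             + segments["three_prime_nc"])
    orf_start = segments["five_prime_nc"] + 1
    codons, remainder = divmod(segments["orf"], 3)
    if remainder:
        raise ValueError("ORF length is not a multiple of three")
    return {"total": total, "orf_start": orf_start, "amino_acids": codons}


for strain, segments in GENOMES.items():
    print(strain, summarize(segments))
```

The totals agree with the polynucleotide designations N-9589 (HC-J6) and N-9511 (HC-J8), and the ORF start positions agree with the 341st and 342nd nucleotides given in the text.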

As compared with the sequence of HCV disclosed in the European Patent Application by Chiron Corp.
(Publication No. 388,232), homology of sequences of the strain HC-J6 was 67.9% for the full nucleotide
sequence and 72.3% for the full amino acid sequence, and homology of sequences of the strain HC-J8 was
66.4% for the full nucleotide sequence and 71.0% for the full amino acid sequence.
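
The text does not state how the homology percentages were computed (alignment method, treatment of gaps). As a hedged illustration only, one common definition is the percentage of identical positions between two sequences that have already been aligned to the same length; the function name `percent_identity` and the toy sequences are assumptions.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of aligned columns (no gap on either side) that are identical.

    Both inputs must already be aligned to the same length; this sketch
    does not perform the alignment itself.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy aligned sequences (hypothetical, not taken from the sequence lists).
print(percent_identity("ACGTACGTAC", "ACGTTCGTAA"))
```

The same function applies to aligned amino acid sequences, since it only compares characters column by column.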

From an examination of homology by region for strain HC-J6, the homology of the nucleotide sequences of
the 5' terminal noncoding region was 94.4% and that of the amino acid sequences of the C region was
90.1%, showing comparatively high homology; on the other hand, from the envelope region downstream,
homologies of amino acid sequence were found to be as low as 60.4% for E, 71.1% for NS-1, 57.8% for
NS-2, 81.1% for NS-3, 73.1% for NS-4, and 69.9% for NS-5. As a result, the HC-J6 strain was found to be
significantly different from the HCV strain found by Chiron Corp.

From an examination of homology by region for strain HC-J8, the homology of the nucleotide sequences of
the 5' terminal noncoding region was 93.8% and that of the amino acid sequences of the C region was
90.1%, showing comparatively high homology; on the other hand, from the envelope region downstream,
homologies of amino acid sequence were found to be as low as 54.7% for E, 73.1% for NS-1, 55.6% for
NS-2, 81.3% for NS-3, 72.1% for NS-4 and 67.3% for NS-5, and the homology of the 3' terminal noncoding
region was 25.9%. As a result, the HC-J8 strain was found to be significantly different from the HCV strain
found by Chiron Corp.

From the comparison of the amino acid sequence of the HC-J6 strain with strain HC-J1 (American type) and
strain HC-J4 (Japanese type) disclosed by the inventors (Japan. J. Exp. Med. (1990), 60: 167-177),
homology in the core region was more than 90% for each strain while that in the envelope region was
60.9% for HC-J1 and 53.1% for HC-J4. Thus, in the present invention, strain HC-J6 was found to be a
different type of virus from strains HC-J1 and HC-J4.

From the comparison of the sequence of the HC-J8 strain with strain HC-J1 (type I) and strain HC-J4
(type II), homology of approximately 3,000 nucleotides of the 5' terminus was 70.1% for HC-J1 and 67.1%
for HC-J4, and from the comparison of all nucleotides with the HC-J6 (type III) genome, homology was as low
as 76.9%. On the other hand, HC-J8 showed high homology with strain HC-J7 (type IV) disclosed in
Japanese patent application 196175/91, namely 93.1% for approximately 3,000 nucleotides of the 5' terminus.

Nucleotide sequences of strains assumed to belong to the same type are expected to show high homology. For
example, homology of 95.6% for approximately 3,000 nucleotides of the 5' terminus between the HCV disclosed by
Chiron Corp. and HC-J1 appears to show that they should both be classified into type I. On the other hand, the low
homology of HC-J8 with HCV, HC-J1, HC-J4 and HC-J6 appears to show that it is not to be classified
into type I, II or III, but into type IV (the same as HC-J7).

Strain HC-J8 has some mutations in the nucleotides, as shown in sequence lists 6 and 7 by the symbols M,
R, W, S, Y, K and B. It can also be easily understood from a comparison of the sequences in sequence
lists 8 and 9 that it has some mutations of amino acids. Mutation of nucleotides was observed up to
approximately 1.4% in the whole genome and that of amino acids was observed up to approximately 1.7%
in the whole ORF. Thus the present invention includes genomes, polynucleotides and polypeptides of strain
HC-J8 having some mutations.
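
The symbols M, R, W, S, Y, K and B are standard IUPAC ambiguity codes, each standing for a set of alternative bases at one position. The sketch below, with an illustrative function name `expand` and a hypothetical three-letter input, lists every concrete sequence that an ambiguous sequence represents. It uses the DNA alphabet; for RNA, T would be read as U.

```python
from itertools import product

# IUPAC nucleotide codes used in the text, plus the four unambiguous bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC",   # A or C
    "R": "AG",   # A or G (purine)
    "W": "AT",   # A or T
    "S": "CG",   # C or G
    "Y": "CT",   # C or T (pyrimidine)
    "K": "GT",   # G or T
    "B": "CGT",  # C, G or T (not A)
}


def expand(seq):
    """Return all concrete sequences represented by an ambiguous sequence."""
    choices = [IUPAC[base] for base in seq.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical example: one fixed base followed by two ambiguous positions.
print(expand("ARY"))
```

The number of variants grows multiplicatively with each ambiguous position, so full expansion is practical only for short stretches.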

In addition, the envelope (E) region (576 nucleotides/192 amino acids, amino acids 192-383) and the NS-1
region (1050 nucleotides/350 amino acids, amino acids 384-733), which have many mutations in HC-J8, are
called hyper-variable regions, since mutations were observed at 20 nucleotides/7 amino acids (3.47%/3.64%)
in the E region and at 37 nucleotides/19 amino acids (3.52%/5.42%) in the NS-1 region. According to these findings,
the present invention can be recognized to include genomes, and polypeptides coded for by the genomes,
of strain HC-J8 having mutations of 3.5% to 5.5% in those regions.
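
The mutation rates quoted for the E and NS-1 regions follow from the counts given in the text. The sketch below recomputes them; the data layout and the function name `region_stats` are illustrative assumptions.

```python
# Per region: amino acid range (inclusive), nucleotide count,
# mutated nucleotides and mutated amino acids, as given in the text.
REGIONS = {
    "E":    {"aa_range": (192, 383), "nt": 576,  "nt_mut": 20, "aa_mut": 7},
    "NS-1": {"aa_range": (384, 733), "nt": 1050, "nt_mut": 37, "aa_mut": 19},
}


def region_stats(region):
    """Return (amino acid count, % mutated nucleotides, % mutated amino acids)."""
    first, last = region["aa_range"]
    aa_count = last - first + 1
    if aa_count * 3 != region["nt"]:
        raise ValueError("nucleotide count does not match amino acid range")
    nt_rate = 100.0 * region["nt_mut"] / region["nt"]
    aa_rate = 100.0 * region["aa_mut"] / aa_count
    return aa_count, nt_rate, aa_rate


for name, region in REGIONS.items():
    aa_count, nt_rate, aa_rate = region_stats(region)
    print(f"{name}: {aa_count} aa, {nt_rate:.4f}% nt, {aa_rate:.4f}% aa")
```

The computed amino acid rates are about 3.646% and 5.429%, so the figures 3.64% and 5.42% in the text appear to be truncated rather than rounded.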

The genome, polynucleotide, and cDNA clones of the present invention can be used as material to
produce peptides of the invention by integration into a host genome, e.g. E. coli or Bacillus, by means of
known genetic engineering techniques.

Polypeptides of the invention are useful as material for diagnostic agents to detect NANB hepatitis 
antibodies with high specificity and as material to produce polyclonal and monoclonal antibodies by known 
techniques. 
Polyclonal and monoclonal antibodies of the invention are useful as materials for diagnostic agents to
detect NANB hepatitis antigens with high specificity.

A detection system using each polypeptide of the present invention or a polypeptide with partial
replacement of amino acids, and a detection system using monoclonal or polyclonal antibodies to such
polypeptides, are useful as diagnostic agents for NANB hepatitis with high specificity and are effective for
screening out NANB hepatitis virus from transfusion blood or blood derivatives. The polypeptides, or
antibodies to such polypeptides, can be used as a material for a vaccine against NANB hepatitis virus.

It is well known in the art that one or more nucleotides in a DNA sequence can be replaced by other
nucleotides in order to produce the same protein. The present invention also concerns such nucleotide
substitutions which yield DNA sequences which code for polypeptides as described above. It is also well
known in the art that one or more amino acids in an amino acid sequence can be replaced by equivalent
other amino acids, as demonstrated by U.S. Patent No. 4,737,487 which is incorporated by reference, in
order to produce an analog of the amino acid sequence. Any analogs of the polypeptides of the present
invention involving amino acid deletions, amino acid replacements, such as replacements by other amino
acids, or by isosteres (modified amino acids that bear close structural and spatial similarity to protein amino
acids), amino acid additions, or isostere additions can be utilized, so long as the sequences elicit
antibodies recognizing NANB antigens.
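
The first point, that different DNA sequences can code for the same protein, follows from the degeneracy of the standard genetic code. The sketch below builds the standard codon table and shows two DNA sequences that differ at several positions yet translate to the same peptide. The two sequences are hypothetical illustrations and are not taken from the sequence lists.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest); '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA sequence in frame 0, ignoring trailing partial codons."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


original = "ATGAGCACAAATCCT"  # hypothetical coding sequence
variant = "ATGTCTACGAACCCA"   # synonymous substitutions in four codons

print(translate(original), translate(variant))
```

Both sequences translate to the same five residues even though they differ at six nucleotide positions.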

Examples of application of this invention are shown below; however, the invention shall in no way be
limited to those examples.


Examples 



The 5' terminal nucleotide sequence and amino acid sequence of the NANB hepatitis virus genome were
determined in the following way:


(1) Isolation of RNA 

RNA of the samples (HC-J1, HC-J6, HC-J8) from plasma of Japanese blood donors testing positive for
HCV (C100-3) antibody (by Ortho HCV Ab ELISA, Ortho Diagnostic System, Tokyo), and that of the sample
(HC-J4) from the chimpanzee challenged with NANB hepatitis for infectivity and negative for HCV antibody,
were isolated by the following method:

Tris chloride buffer (10 mM, pH 8.0) was added to each plasma sample, and the mixture was centrifuged at
68 x 10^3 rpm for 1 hour. The precipitate was suspended in Tris chloride buffer (50 mM, pH 8.0) containing 200 mM
NaCl, 10 mM EDTA, 2% (w/v) sodium dodecyl sulfate (SDS) and 1 mg/ml proteinase K, and incubated at 60°C
for 1 hour; the nucleic acids were then extracted with phenol/chloroform and precipitated with ethanol to
obtain RNA.

(2) HC-J1 and HC-J8 cDNA Synthesis 

After heating the RNA isolated from HC-J1 or HC-J8 plasma at 70°C for 1 minute, it was used as a
template; 10 units of reverse transcriptase (cDNA Synthesis System Plus, Amersham Japan) and 20 pmol
of oligonucleotide primer (20-mer) were added and incubated at 42°C for 1.5 hours to obtain cDNA. Primer
#8 (5'-GATGCTTGCGGAAGCAATCA-3') was prepared by referring to the basic sequence shown in
European Patent Application No. 88310922.5, which is relied on and incorporated herein by reference.


(3) Amplification of cDNA by Polymerase Chain Reaction (PCR)

cDNA was amplified for 35 cycles according to Saiki's method (Science (1988) 239: 487-491) using
Gene Amp DNA Amplifier Reagent (Perkin-Elmer Cetus) on a DNA Thermal Cycler (Perkin-Elmer Cetus).
For cDNA synthesis and for PCR for HC-J8, synthesized primers disclosed in Japanese patent
application 153402/90 and those based on the HC-J1, HC-J4 and HC-J6 genomes disclosed in Japanese patent
application 196175/91 and below were utilized.

(4) Determination of 5' Terminal Nucleotide Sequence of HC-J1 and HC-J4 by Assembling cDNA Clones

As shown in Figures 2 and 3, nucleotide sequences of the 5' termini of the genomes of strains HC-J1 and
HC-J4 were determined by combined analysis of clones obtained from the cDNA library constructed in
bacteriophage lambda gt10 and clones obtained by amplification of HCV specific cDNA by PCR.
Figures 2 and 3 show the 5' termini of the NANB hepatitis virus genome together with cleavage sites of
restriction endonucleases and the sequences of the primers used. In the figures, solid lines are nucleotide sequences
determined by clones from the bacteriophage lambda gt10 library while dotted lines show sequences determined by
clones obtained by PCR.

A 1656 nucleotide sequence of HC-J1 spanning nt454-2109 was determined by clone 041 which was
obtained by inserting the cDNA synthesized with the primer #8 into lambda gt10 phage vector (Amersham).

Another primer #25 (5'-TCCCTGTTGCATAGTTCACG-3') corresponding to nt824-843 was synthesized
based on the 041 sequence, and four clones (060, 061, 066 and 075) were obtained to cover the upstream
sequence nt18-843.
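
Primers used for cDNA synthesis from genomic RNA are antisense, so the stretch of the genome they anneal to is their reverse complement. The sketch below, with an illustrative helper name `reverse_complement`, derives the sense-strand (cDNA-notation) sequence corresponding to primer #25.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


primer_25 = "TCCCTGTTGCATAGTTCACG"  # primer #25 as given in the text
sense_target = reverse_complement(primer_25)

# nt824-843 spans 843 - 824 + 1 = 20 positions, matching the 20-mer.
print(sense_target, len(sense_target))
```

Applying the function twice returns the original primer, which is a quick sanity check on any primer entered by hand.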

(5) Determination of 5' Terminal Nucleotide Sequence of HC-J6

The nucleotide sequence of the 5' terminus of strain HC-J6 was determined from analysis of clones
obtained by PCR amplification as shown in Figure 4.
Isolation of RNA from HC-J6 and determination of its sequence were carried out in the same manner as
described in (2) above. Sequences in the range of nt24-2551 of the RNA were determined from the consensus
sequence of the respective clones obtained by PCR amplification using each pair of primers based on
the nucleotide sequence of HC-J4.

nt24-826

#32 (5'-ACTCCACCATAGATCACTCC-3')
#122 (5'-AGGTTCCCTGTTGCATAATT-3')
Clones: C9397, C9388, C9764

nt732-1907

#50 (5'-GCCGACCTCATGGGGTACAT-3')
#128 (5'-TCGGTCGTGCCCACTACCAC-3')
Clones: C9316, C9752, C9753

nt1847-2571

#149 (5'-TCTGTGTGTGGCCCAGTGTA-3')
#146 (5'-AGTAGCATCATCCACAAGCA-3')
Clones: C11621, C11624, C11655


In order to determine the sequence further upstream toward the 5' terminus, cDNA was synthesized using
antisense primer #36 (5'-AACACTACTCGGCTAGCAGT-3') corresponding to nt246-265, dAs were added to the
3' terminus of the cDNA (corresponding to the 5' terminus of the genome) using
terminal deoxynucleotidyl transferase, and one-sided PCR amplification was carried out twice as described
below.

cDNA was amplified for 35 cycles as the first stage PCR using oligo dT primer (20-mer) and antisense
primer #48 (5'-GTTGATCCAAGAAAGGACCC-3') of nt188-207, followed by the second stage PCR of 30
cycles using the first PCR product as a template, oligo dT primer (20-mer) and antisense
by PCR for 35 cycles; and in the second stage, using the product of the first-stage PCR as a template,
non-specific primer #166 (5'-AAGGATCCGTCGACATCGAT-3') and antisense primer #109 (21-mer;
5'-ACCGGATCCGCAGACCACTAT-3') were added to initiate PCR for 30 cycles. The product of PCR was subcloned
into M13 phage vector.

Thirteen independent clones (poly dT-tailed: C14951, C14952, C14953, C14958, C14960, C14968,
C14971, C14972 and C14974; poly dA-tailed: C14987, C14996, C14999 and C15000) were obtained (each
considered to have the complete length of the 5' terminus), and the consensus sequence of nt1-139 of the
respective clones was determined.

(10) cDNA amplification of ORF region and 3' terminus by PCR

As shown in Figure 6, the nucleotide sequence downstream from nt140 of the HC-J8 genome was
determined by analysis of clones obtained by amplification of HCV specific cDNA by PCR.

Single-stranded cDNAs to HC-J8 RNA were synthesized in the same manner as (2) above using the
antisense primers described below, then they were amplified by PCR using the sense and antisense primers
described below. Each product of PCR was subcloned into M13 phage vector, then the consensus sequence of
the respective clones of each region was determined.

The primers for cDNA synthesis and PCR amplification, and the numbers of the obtained clones, are shown
below for each region. The alphabetical symbol of each amplified region corresponds to that in Figure 6.
b region
nt45-847

Primer for cDNA synthesis: #122 (5'-AGGTTCCCTGTTGCATAATT-3')
Primer for PCR: sense: #32A (5'-CTGTGAGGAACTACTGTCTT-3')

antisense: #122
Clones: C15221, C15222, C15223

c region
nt732-1354

Primer for cDNA synthesis: #54 (5'-ATCGCGTACGCCAGGATCAT-3')
Primer for PCR: sense: #50 (5'-GCCGACCTCATGGGGTACAT-3')

antisense: #54
Clones: C15256, C15257, C15258

d region
nt1300-1879

Primer for cDNA synthesis: #199 (5'-GGGGTGAAACAATACACCGG-3')
Primer for PCR: sense: #205 (5'-GGGACATGATGATCAACTGG-3')

antisense: #199
Clones: C14221, C14222, C14223
e region
nt1833-2518

Primer for cDNA synthesis: #146 (5'-AGTAGCATCATCCACAAGCA-3')
Primer for PCR: sense: #150 (5'-ATCGTCTCGGCTAAGACGGT-3')

antisense: #146
Clones: C11535, C11540, C11566

f region
nt2433-3451

Primer for cDNA synthesis: #170 (5'-GCATAAGCAGTGATGGGGGC-3')
Primer for PCR: sense: #160 (5'-CAGAACATCGTGGACGTGCA-3')

antisense: #170
Clones: C15348, C15349, C15356

g region
nt3404-4300

Primer for cDNA synthesis: #225 (5'-TCGCATATGATGATGTCATA-3')
Primer for PCR: sense: #238 (5'-CTACACGTCCAAGGGGTGGA-3')

antisense: #225
Clones: C15701, C15702, C15703

h region
nt4221-5015

Primer for cDNA synthesis: #216 (5'-GTGGTCTAGACATACGGGCA-3')

Primer for PCR: sense: #230 (5'-GCCATCACGTACTCGACATA-3')

antisense: #216
primer #109 (21-mer; 5'-ACCGGATCCGCAGACCACTAT-3') corresponding to nt140 to 160. The obtained
PCR product was subcloned into M13 phage vector.

The nucleotide sequence from nt1 to 23 was determined from the consensus sequence of 13 isolated clones
C9577, C9579, C9581, C9587, C9590, C9591, C9595, C9606, C9609, C9615, C9616 and C9619 obtained
above, which were considered to have the complete 5' terminus.

(6) Determination of nucleotide sequence of HC-J6 middle region



A cDNA library was constructed using lambda gt10 according to the method described in (2) above, with
100 ml of HC-J6 plasma as the starting material. Primers #162 and #81 were prepared for the synthesis by
referring to the basic sequence shown in European Patent Application Publication No. 318,216. Clones
were selected by plaque hybridization.

The nucleotide sequence from nt2552 to 8700 was determined from the consensus sequence of four obtained
cDNA clones, 02 (nt6996 to 8700), 06 (nt6485 to 8700), 08 (nt6008 to 8700) and 081 (nt2199 to 6168), as
shown in Figure 1. Clones 081 and 08 were found to have the nucleotide sequences shown in sequence lists 3
and 4, respectively.

(7) Determination of 3' terminal nucleotide sequence of HC-J6 strain



As shown in Figure 5, the nucleotide sequence of the 3' terminus of the HC-J6 genome was determined by
analysis of clones obtained by amplification of HCV specific cDNA by PCR.

The nucleotide sequence of HC-J6 from nt8701 to 9241 was determined from the consensus sequence of three
clones consisting of 938 nucleotides, C9760, C9234 and C9761, obtained by amplification of the sample using
primers #80 (5'-GACACCCGCTGTTTTGACTC-3') and #60 (5'-GTTCTTACTGCCCAGTTGAA-3').
The nucleotide sequence of the 3' terminus downstream from nt9242 was determined by the method described
below.

Isolation of RNA from HC-J6 was carried out in the same manner as described in (1) above. Poly (A) was
added to the 3' terminus of the obtained RNA using poly (A) polymerase, cDNA was synthesized using
oligo (dT)20 as a primer, and the obtained cDNA was used for PCR as a template.

The first PCR product was made using #97 (5'-AGTGAGGGCGTCCCTCATCT-3') as a sense primer
and oligo (dT)20 as an antisense primer. The second PCR product was made using #90
(5'-GCCGTTTGCGGCCGATATCT-3'), corresponding to a sequence downstream of #97, as a sense primer, and
oligo (dT)20 as an antisense primer, with the first PCR product as a template. The PCR product obtained by two-step
amplification was blunt-ended by treatment with T4 DNA polymerase, followed by
phosphorylation of the 5' terminus by T4 polynucleotide kinase. The obtained product was subcloned into the HincII
site of M13mp19 phage vector.

The nucleotide sequence of the 3' terminus was determined from the consensus sequence of 19 obtained clones,
C10311, C10313, C10314, C10320, C10322, C10323, C10326, C10328, C10330, C10333, C10334, C10336,
C10337, C10345, C10346, C10347, C10349, C10350 and C10357.

As a result, the nucleotide sequence of cDNA to HC-J6 genome RNA was determined as shown in
sequence list 2, and the full sequence of the genome RNA was determined as shown in sequence list 1.

(8) Determination of amino acid sequences. 

From the nucleotide sequence of the genome of strain HC-J6, the sequence of the coding
region starting with ATG was determined. As a result, the HC-J6 genome was found to have a long open
reading frame coding for a polypeptide precursor consisting of 3033 amino acid residues.
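
The text does not describe the procedure used to locate the open reading frame. As a hedged sketch of one straightforward approach, the function below scans the three forward reading frames of a cDNA sequence for the longest stretch that starts with ATG and runs in frame to a stop codon. The function name `longest_orf` and the toy sequence are assumptions for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(dna):
    """Return (start, end) of the longest ATG-to-stop ORF on the forward strand.

    Coordinates are 0-based and half-open; `end` is the index of the stop
    codon, so the stop itself is excluded. ORFs with no in-frame stop
    codon are ignored. Returns None if no ORF is found.
    """
    dna = dna.upper().replace("U", "T")
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if best is None or i - start > best[1] - best[0]:
                    best = (start, i)
                start = None
    return best


toy = "CCATGAAATTTGGGTAACC"  # hypothetical sequence, not from the sequence lists
start, end = longest_orf(toy)
print(start + 1, (end - start) // 3)  # 1-based start position, codon count
```

For a genome laid out as described in the text, the 1-based start would correspond to the 341st nucleotide for HC-J6 and the codon count to the 3033 residues of the precursor.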

(9) Determination of 5' terminal nucleotide sequence of HC-J8

As shown in Figure 6, the nucleotide sequence of the 5' terminus of the HC-J8 genome (a region) was
determined by analysis of clones obtained by amplification of HCV specific cDNA by PCR.

Single-stranded cDNA was synthesized using antisense primer #36 (5'-AACACTACTCGGCTAGCAGT-3')
of nt246 to 265 in the same manner as (2) above, then a dATP tail was added at its 3' terminus by
terminal deoxynucleotidyl transferase, and it was amplified by one-sided PCR in two stages.

That is, in the first stage, antisense primer #48 (5'-GTTGATCCAAGAAAGGACCC-3') of nt188 to 207
was used with a sense primer selected from non-specific primers #165
(5'-AAGGATCCGTCGACATCGATAATACG(A)17-3') and #171 (5'-AAGGATCCGTCGACATCGATAATACG(T)17-3') to amplify the dA-tailed cDNA
Clones: C15391, C15392, C15393

i region
nt4695-5062

Primer for cDNA synthesis: #210 (5'-GCATCTATGTGTGTGAGGCC-3')
Primer for PCR: sense: #209 (5'-TTCGACTCCGTGATCGACTG-3')

antisense: #210
Clones: C14087, C14088, C14089

j region
nt5021-6169

Primer for cDNA synthesis: #162 (5'-TCCGACTCCGTCACGTAGTG-3')
Primer for PCR: sense: #227 (5'-GTTCTGGGAAGCGGTCTTTA-3')

antisense: #162
Clones: C15421, C15422, C15423

k region
nt6027-6889

Primer for cDNA synthesis: #232 (5'-GATGGGTCTGTTAGCATGGA-3')
Primer for PCR: sense: #242 (5'-TTGGTAGTGGGAGTCATCTG-3')

antisense: #232
Clones: C15733, C15734, C15735

l region
nt6834-7735

Primer for cDNA synthesis: #239 (5'-ATCGGTAACTTCTCCTCTTC-3')
Primer for PCR: sense: #241 (5'-CCTTGCGATCCTGAACCTGA-3')

antisense: #239
Clones: C15798, C15799, C15800

m region
nt7656-8630

Primer for cDNA synthesis: #222 (5'-GACCAGGTCGTCTCCACACA-3')
Primer for PCR: sense: #229 (5'-GTCGTGTGCTGCTGCATGTC-3')

antisense: #222
Clones: C15376, C15378, C15381

n region
nt8325-9511

Primer for cDNA synthesis: #165

Primer for PCR: sense: #80 (5'-GACACCCGCTGTTTTGACTC-3')

non-specific: #165
Clones: C15270, C15271, C15272

From the analysis described above, the full nucleotide sequence of cDNA to HC-J8 was determined as
shown in sequence list 7, and the full nucleotide sequence of HC-J8 genome RNA as shown in sequence list 6.
The two amino acid sequences shown in sequence lists 8 and 9 represent those coded for by the HC-J8 genome.
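
Assembling a full-length sequence from PCR clones requires that neighbouring amplified regions overlap. The sketch below checks this for the HC-J8 regions b to n using the nucleotide ranges listed in the text; region a (the 5' terminus) is handled separately in the text and is not included. The function name `find_gaps` is an illustrative assumption.

```python
# (region label, first nt, last nt) for HC-J8 regions b to n, from the text.
REGIONS = [
    ("b", 45, 847), ("c", 732, 1354), ("d", 1300, 1879), ("e", 1833, 2518),
    ("f", 2433, 3451), ("g", 3404, 4300), ("h", 4221, 5015), ("i", 4695, 5062),
    ("j", 5021, 6169), ("k", 6027, 6889), ("l", 6834, 7735), ("m", 7656, 8630),
    ("n", 8325, 9511),
]


def find_gaps(regions):
    """Return uncovered stretches as (prev_label, next_label, gap_start, gap_end)."""
    ordered = sorted(regions, key=lambda r: r[1])
    gaps = []
    prev_label, _, covered_to = ordered[0]
    for label, start, end in ordered[1:]:
        if start > covered_to + 1:
            gaps.append((prev_label, label, covered_to + 1, start - 1))
        if end > covered_to:
            covered_to, prev_label = end, label
    return gaps


print(find_gaps(REGIONS))  # an empty list means every junction overlaps
```

With the ranges as listed, every consecutive pair of regions overlaps, so regions b to n cover nt45 to nt9511 without a gap.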

Utilizing known immunological techniques, it is possible to determine epitopes (e.g., from the core
region, etc.) from the polypeptides of sequence lists 5, 8 and 9. Determination of such epitopes of the
NANB hepatitis virus opens access to chemical synthesis of the peptide, manufacturing of the peptide by
genetic engineering techniques, synthesis of the polynucleotides, manufacturing of the antibody,
manufacturing of NANB hepatitis diagnostic reagents, and development of products such as NANB hepatitis
vaccines.

NANB peptides can be synthesized according to the well-known method described by Merrifield.
Furthermore, the polynucleotides in sequence lists 2-4 and 7 can be used to express polypeptides in host
cells such as Escherichia coli by means of genetic engineering techniques.

A detection system for antibody against NANB hepatitis virus can be developed using polyvinyl
microtiter plates and the sandwich method. For example, 50 ul of a 5 ug/ml solution of a NANB peptide
can be dispensed in each well of the microtiter plates and incubated overnight at room temperature for
consolidation. The microplate wells can be washed five times with physiological saline containing 0.05%
Tween 20. For overcoating, 100 ul of NaCl buffer containing 30% (v/v) of calf serum and 0.05% Tween 20
(CS buffer) can be dispensed in each well and discarded after incubation for 30 minutes at room
temperature.
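
The quantities implied by the coating and overcoating steps above can be worked out as follows. This is a small illustrative calculation in Python; the function names and the 96-well plate size are assumptions made for the example, while the volumes and concentrations are those given in the text.

```python
def peptide_mass_per_well_ug(volume_ul: float, conc_ug_per_ml: float) -> float:
    """Mass of peptide dispensed per well, in micrograms."""
    return volume_ul * conc_ug_per_ml / 1000.0  # 1 ml = 1000 ul


def component_volume_ul(total_volume_ul: float, percent_v_v: float) -> float:
    """Volume of a component present at a given % (v/v) in a dispensed volume."""
    return total_volume_ul * percent_v_v / 100.0


if __name__ == "__main__":
    wells = 96  # assumed plate format, not stated in the text
    per_well = peptide_mass_per_well_ug(50, 5)
    print(f"peptide per well: {per_well} ug")
    print(f"peptide per {wells}-well plate: {per_well * wells} ug")
    print(f"calf serum in 100 ul of CS buffer: {component_volume_ul(100, 30)} ul")
```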

For determination of NANB antibodies in samples, in the primary reaction, 50 ul of the CS buffer
containing 30% calf serum and 10 ul of a sample can be dispensed in each microplate well and incubated
on a microplate vibrator for one hour at room temperature. After completion of the reaction, microplate wells
can be washed five times in the same way as previously described.

In the secondary reaction, 1 ng of horseradish peroxidase labeled anti-human IgG mouse monoclonal
antibodies (Fab' fragment: 22G, Institute of Immunology Co., Ltd., Tokyo, Japan) dissolved in 50 ul of
calf serum can be dispensed, as the labeled antibody, in each microplate well and incubated on a microplate
vibrator for one hour at room temperature. Wells can be washed five times in the same way. After addition
of hydrogen peroxide (as substrate) and 50 ul of o-phenylenediamine solution (as color developer) in each
well, and after incubation for 30 minutes at room temperature, 50 ul of 4M sulphuric acid can be dispensed
in each well to stop further color development, and absorbance can be read at 492 nm.

The cut-off level of this assay system can be set by measuring a number of donor samples with a normal
serum ALT (alanine aminotransferase) value of 34 Karmen units or below and which tested negative for
anti-HCV.
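
The text does not state how the cut-off is derived from the donor measurements. One common convention, assumed here purely for illustration, is the mean absorbance of the negative donor samples plus a multiple of their standard deviation. The Python sketch below follows that assumption; the function names, the multiplier of 3 and the absorbance values are all hypothetical and are not data from this document.

```python
from statistics import mean, stdev


def cutoff_from_negatives(absorbances, k=3.0):
    """Assumed rule: cut-off = mean + k * sample standard deviation."""
    if len(absorbances) < 2:
        raise ValueError("at least two negative samples are needed")
    return mean(absorbances) + k * stdev(absorbances)


def is_reactive(absorbance, cutoff):
    """A sample is called reactive when its absorbance exceeds the cut-off."""
    return absorbance > cutoff


if __name__ == "__main__":
    # Hypothetical absorbance readings at 492 nm from negative donor samples.
    negatives = [0.05, 0.06, 0.04, 0.05, 0.07]
    cutoff = cutoff_from_negatives(negatives)
    print(f"cut-off: {cutoff:.3f}")
    for reading in (0.06, 0.50):
        print(reading, "reactive" if is_reactive(reading, cutoff) else "non-reactive")
```

In practice a larger panel of donor samples would be measured, and the multiplier would be chosen to balance sensitivity against the rate of false positive results.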

The present invention makes possible the detection of NANB hepatitis virus infection which could not be
detected by conventional determination methods, and provides NANB hepatitis detection kits capable of
highly specific and sensitive detection at an early phase of infection.

These features allow accurate diagnosis of patients at an early stage of the disease and also help to
remove NANB hepatitis virus carrier blood at a higher rate through screening tests of donor blood.

Polypeptides and their antibodies under this invention can be utilized for the manufacture of vaccines and
immunological pharmaceuticals, and the structural gene of the NANB hepatitis virus provides indispensable
tools for detection of polypeptide antigens and antibodies.

Antigen-antibody complexes can be detected by methods known in this art. Specific monoclonal and 
polyclonal antibodies can be obtained by immunizing such animals as mice, guinea pigs, rabbits, goats and 
horses with NANB peptides (e.g., bearing NANB hepatitis antigenic epitope). 

The present invention is based on studies of the isolated genomes of the NANB hepatitis virus strains named
HC-J6 and HC-J8, and was completed by clarification of their full nucleotide sequences. The invention
makes possible highly specific detection of NANB hepatitis virus and the provision of polypeptides, polyclonal
antibodies and monoclonal antibodies to prepare the test system.

Further variations and modifications of the invention will become apparent to those skilled in the art 
from the foregoing and are intended to be encompassed by the claims appended hereto. 

Japanese Priority Applications 287402/91 filed August 9, 1991 and 360441/91 filed December 5,
1991 are relied on and incorporated by reference. U.S. patent applications serial no. 07/540,604 (filed June
19, 1990), 07/653,090 (filed February 8, 1991), and 07/712,875 (filed June 11, 1991) are incorporated by
reference in their entirety.

Sequence list

Sequence list 1: whole nucleotides of HC-J6 genome RNA
Sequence list 2: N-9589, whole nucleotides of cDNA to HC-J6 genome RNA
Sequence list 3: J6-081, nucleotides of clone J6-081
Sequence list 4: J6-08, nucleotides of clone J6-08
Sequence list 5: P-J6-3033, whole amino acids of ORF of HC-J6 genome
Sequence list 6: whole nucleotides of HC-J8 genome RNA
Sequence list 7: whole nucleotides of cDNA to HC-J8 genome RNA
Sequence list 8: whole amino acids of a variation of ORF of HC-J8 genome
Sequence list 9: whole amino acids of a variation of ORF of HC-J8 genome

Claims

1. Recombinant RNA of non-A, non-B hepatitis virus, strain HC-J6, comprising the nucleotide sequence of
sequence list 1.

2. Recombinant cDNA of non-A, non-B hepatitis virus, strain HC-J6, comprising the nucleotide sequence
of sequence list 2.

3. cDNA clone J6-081 comprising the nucleotide sequence of sequence list 3.

4. cDNA clone J6-08 comprising the nucleotide sequence of sequence list 4.

5. Amino acid sequence corresponding to recombinant cDNA of non-A, non-B hepatitis virus, strain HC-J6,
comprising the amino acid sequence of sequence list 5.

6. Recombinant RNA of non-A, non-B hepatitis virus, strain HC-J8, comprising the nucleotide sequence of 
sequence list 6. 

7. Recombinant cDNA of non-A, non-B hepatitis virus, strain HC-J8, comprising the nucleotide sequence
of sequence list 7.

8. Amino acid sequence corresponding to recombinant cDNA of non-A, non-B hepatitis virus, strain HC-J8,
comprising the amino acid sequence of sequence list 8.

9. Amino acid sequence corresponding to recombinant cDNA of non-A, non-B hepatitis virus, strain HC-J8,
comprising the amino acid sequence of sequence list 9.

10. A non-A, non-B hepatitis diagnostic test kit for analyzing samples for the presence of antibodies
directed against a non-A, non-B hepatitis antigen, comprising an antigen attached to a solid substrate
and labeled anti-human immunoglobulin; wherein said antigen is an antigen selected from the antigens
contained in sequence lists 5, 8 or 9.

11. A method of detecting antibodies directed against a non-A, non-B hepatitis antigen in a sample, said
method comprising:

(a) reacting said sample with an antigen selected from the antigens contained in sequence lists 5, 8
or 9 to form antigen-antibody complexes; and

(b) detecting said antigen-antibody complexes.

12. A non-A, non-B hepatitis specific monoclonal or polyclonal antibody reactive with an antigen, wherein
said antigen is an antigen selected from the antigens contained in sequence lists 5, 8 or 9.

13. A method of detecting non-A, non-B hepatitis antigen in a sample, said method comprising: 

(a) reacting said sample with the non-A, non-B hepatitis monoclonal or polyclonal antibody according 
to claim 12 to form antigen-antibody complexes; and 

(b) detecting said antigen-antibody complexes. 






EP0 532 167 A2 



[Fig. 5 (drawing not reproduced). Recoverable labels: scale marks 8000 and 9000; primers #80, #60, #97 and
#90; positions 8324, 9261 and 9221; clones C9234, C10311, C10313, C10314, C10320, C10322, C10323, C10326,
C10328, C10330, C10333, C10334, C10336, C10337, C10345, C10346, C10347, C10349, C10350 and C10357.]



Sequence ID No. 1
Sequence Length: 9,589
Sequence Type: nucleic acid
Strandedness: single
Topology: linear
Molecule Type: genomic RNA
Method for Determination of Feature: E



ACCCGCCCCU AAUAGGGGCG ACACUCCGCC AUGAACCACU CCCCUGUGAG GAACUACUGU 60
CUUCACGCAG AAAGCGUCUA GCCAUGGCGU UAGUAUGAGU GUCGUACAGC CUCCAGGCCC 120
CCCCCUCCCG GGAGAGCCAU AGUGGUCUGC GGAACCGGUG AGUACACCGG AAUUGCCGGG 180
AAGACUGGGU CCUUUCUUGG AUAAACCCAC UCUAUGCCCG GUCAUUUGGG CGUGCCCCCG 240
CAAGACUGCU AGCCGAGUAG CGUUGGGUUG CGAAAGGCCU UGUGGUACUG CCUGAUAGGG 300
UGCUUGCGAG UGCCCCGGGA GGUCUCGUAG ACCGUGCACC AUGAGCACAA AUCCUAAACC 360
UCAAAGAAAA ACCAAAAGAA ACACCAACCG UCGCCCACAA GACGUUAAGU UUCCGGGCGG 420
CGGCCAGAUC GUUGGCGGAG UAUACUUGUU GCCGCGCAGG GGCCCCAGGU UGGGUGUGCG 480
CGCGACAAGG AAGACUUCGG AGCGGUCCCA GCCACGUGGA AGGCGCCAGC CCAUCCCUAA 540
GGAUCGGCGC UCCACUGGCA AAUCCUGGGG AAAACCAGGA UACCCCUGGC CCCUAUACGG 600


GAAUGAGGGA CUCGGCUGGG CAGGAUGGCU CCUGUCCCCC CGAGGUUCCC GUCCCUCUUG 660
GGGCCCCAAU GACCCCCGGC AUAGGUCCCG CAACGUGGGU AAGGUCAUCG AUACCCUAAC 720
GUGCGGCUUU GCCGACCUCA UGGGGUACAU CCCUGUCGUA GGCGCCCCGC UCGGCGGCGU 780
CGCCAGAGCU CUCGCGCAUG GCGUGAGAGU CCUGGAGGAC GGGGUUAAUU UUGCAACAGG 840
GAACUUACCC GGUUGCUCCU UUUCUAUCUU CUUGCUGGCC CUGCUGUCCU GCAUCACCAC 900
CCCGGUCUCC GCUGCCGAAG UGAAGAACAU CAGUACCGGC UACAUGGUGA CCAACGACUG 960
CACCAAUGAU AGCAUUACCU GGCAACUCCA GGCUGCUGUC CUCCACGUCC CCGGGUGCGU 1020
CCCGUGCGAG AAAGUGGGGA AUACAUCUCG GUGCUGGAUA CCGGUCUCAC CGAAUGUGGC 1080
CGUGCAGCAG CCCGGCGCCC UCACGCAGGG CUUACGGACG CACAUUGACA UGGUUGUGAU 1140
GUCCGCCACG CUCUGCUCCG CUCUUUACGU GGGGGACCUC UGCGGUGGGG UGAUGCUUGC 1200


AGCCCAGAUG UUCAUUGUCU CGCCACAGCA CCACUGGUUU GUGCAAGACU GCAAUUGCUC 1260
CAUCUACCCU GGUACCAUCA CUGGACACCG CAUGGCGUGG GACAUGAUGA UGAACUGGUC 1320
GCCCACGGCU ACCAUGAUCC UGGCGUACGC GAUGCGCGUC CCCGAGGUCA UCAUAGACAU 1380
CAUUGGCGGG GCUCAUUGGG GCGUCAUGUU CGGCUUAGCC UACUUCUCUA UGCAGGGAGC 1440
GUGGGCAAAA GUCGUUGUCA UUCUUUUGCU GGCCGCCGGG GUGGACGCGC AAACCCAUAC 1500
CGUUGGGGGU UCUACCGCGC AUAACGCCAG GACCCUCACC GGCAUGUUCU CCCUUGGUGC 1560
CAGGCAGAAA AUCCAGCUCA UCAACACCAA UGGCAGUUGG CACAUCAACC GCACCGCCCU 1620
GAACUGCAAU GACUCUUUGC ACACCGGCUU CCUCGCGUCA CUGUUCUACA CCCACAGCUU 1680
CAACUCGUCA GGAUGUCCCG AACGCAUGUC CGCCUGCCGC AGUAUCGAGG CCUUUCGGGU 1740
GGGAUGGGGC GCCUUACAAU AUGAGGACAA UGUCACCAAU CCAGAGGAUA UGAGACCGUA 1800


UUGCUGGCAC UACCCACCAA GACAGUGUGG UGUAGUCUCC GCGAGCUCUG UGUGUGGCCC 1860
AGUGUACUGU UUCACCCCCA GCCCAGUAGU AGUGGGUACG ACCGAUAGAC UUGGAGCGCC 1920
CACUUACACG UGGGGGGAGA AUGAGACAGA UGUCUUCCUA UUGAACAGCA CUCGACCACC 1980
GCAGGGGUCA UGGUUCGGCU GCACGUGGAU GAACUCCACU GGCUACACCA AGACUUGCGG 2040
CGCACCACCC UGCCGCAUUA GAGCUGACUU CAAUGCCAGC AUGGACUUGU UGUGCCCCAC 2100
GGACUGUUUU AGGAAGCAUC CUGAUACCAC CUACAUCAAA UGUGGCUCUG GGCCCUGGCU 2160
CACGCCAAGG UGCCUGAUCG ACUACCCCUA CAGGCUCUGG CAUUACCCCU GCACAGUUAA 2220
CUAUACCAUC UUCAAAAUAA GGAUGUAUGU GGGGGGGGUC GAGCACAGGC UCACGGCUGC 2280
GUGCAAUUUC ACUCGUGGGG AUCGUUGCAA CUUGGAGGAC AGAGACAGAA GUCAACUGUC 2340
UCCUUUGCUG CACUCCACCA CGGAGUGGGC CAUUUUACCU UGCACUUACU CGGACCUGCC 2400


CGCCUUGUCG ACUGGUCUUC UCCACCUCCA CCAAAACAUC GUGGACGUGC AAUUCAUGUA 2460
UGGCCUAUCA CCUGCUCUCA CAAAAUACAU CGUCCGAUGG GAGUGGGUAG UACUCUUAUU 2520
CCUGCUCUUA GCGGACGCCA GGGUUUGCGC CUGCUUAUGG AUGCUCAUCU UGUUGGGCCA 2580
GGCCGAAGCA GCACUAGAGA AGUUGGUCGU CUUGCACGCU GCGAGCGCAG CUAGCUGCAA 2640
UGGCUUCCUA UACUUUGUCA UCUUUUUCGU GGCUGCUUGG UACAUCAAGG GUCGGGUAGU 2700
CCCCUUGGCU ACUUAUUCCC UCACUGGCCU AUGGUCCUUU GGCCUACUGC UCCUAGCAUU 2760
GCCCCAACAG GCUUAUGCUU AUGACGCAUC UGUACAUGGU CAGAUAGGAG CAGCUCUGUU 2820


GGUACUGAUC ACUCUCUUUA CACUCACCCC CGGGUAUAAG ACCCUUCUCA GCCGGUUUCU 2880
GUGGUGGUUG UGCUAUCUUC UGACCCUGGC GGAAGCUAUG GUCCAGGAGU GGGCACCACC 2940
UAUGCAGGUG CGCGGUGGCC GUGAUGGGAU CAUAUGGGCC GUCGCCAUAU UCUGCCCGGG 3000
UGUGGUGUUU GACAUAACCA AGUGGCUCUU GGCGGUGCUU GGGCCUGCUU AUCUCCUAAA 3060
AGGUGCUUUG ACGCGUGUGC CGUACUUCGU CAGGGCUCAC GCUCUACUAA GGAUGUGCAC 3120
CAUGGUAAGG CAUCUCGCGG GGGGUAGGUA CGUCCAGAUG GUGCUACUAG CCCUUGGCAG 3180
GUGGACUGGC ACUUACAUCU AUGACCACCU CACCCCUAUG UCGGAUUGGG CUGCUAAUGG 3240
[Nucleotides 3241-4140 are illegible in this copy.]


GUUUGGGGCG UACUUGUCCA AGGCACAUGG CAUCAAUCCC AACAUUAGGA CUGGGGUCAG 4200
GACUGUGACG ACCGGGGCGC CCAUCACGUA CUCCACAUAU GGCAAAUUCC UCGCCGAUGG 4260
GGGCUGCGCA GGCGGCGCCU AUGACAUCAU CAUAUGCGAU GAAUGCCAUG GCGUGGACUC 4320
UACCACCAUU CUCGGCAUCG GAACAGUCCU CGAUCAAGCA GAGACAGCCG GGGUCAGGCU 4380
AACUGUACUG GCUACGGCUA CGCCCCCCGG GUCAGUGACA ACCCCCCACC CCAACAUAGA 4440
GGAGGUGGCC CUCGGGCAGG AGGGUGAGAU CCCCUUCUAU GGGAGGGCGA UUCCCCUGUC 4500
AUACAUCAAG GGAGGAAGAC ACUUGAUCUU CQGCCACUCA AAGAAAAAGU GUGACGAGCU 4560


CGCGGCGGCC CUUCGGGGUA UGGGCUUGAA CGCAGUGGCA UACUACAGAG GGCUGGACGU 4620
[Nucleotides 4621-5880 are illegible in this copy.]


CUUGGGUAAG GUGCUGGUGG ACAUCCUGGC AGGGUAUGGU GCGGGCAUUU CGGGGGCUCU 5940
CGUCGCAUUC AAGAUCAUGU CUGGCGAGAA GCCCUCCAUG GAGGAUGUUG UCAACCUGCU 6000
GCCUGGAAUU CUGUCUCCGG GUGCCCUGGU GGUGGGAGUC AUCUGCGCGG CCAUCCUGCG 6060
CCGACACGUG GGACCGGGGG AAGGCGCUGU CCAAUGGAUG AAUAGGCUCA UUGCCUUUGC 6120
UUCCAGAGGA AACCACGUCG CCCCCACCCA CUACGUGACG GAGUCGGAUG CGUCGCAGCG 6180
UGUGACCCAA CUACUUGGCU CCCUUACCAU AACCAGCCUG CUCAGGAGAC UCCACAACUG 6240
GAUQACUGAA GACUGCCCCA UCCCAUGCAG CGGCUCGUGG CUCCGCGAUG UGUGGGAUUG 6300
GGUUUGCACC AUCCUAACAG ACUUUAAAAA CUGGCUGACC UCCAAAUUGU UCCCAAAGAU 6360
GCCUGGUCUC CCCUUUAUCU CUUGUCAAAA GGGGUACAAG GGCGUGUGGG CUGGCACUGG 6420
UAUCAUGACC ACACGGUGUC CUUGCGGCGC CAAUAUCUCU GGCAAUGUCC GCCUGGGCUC 6480
CAUGAGAAUU ACGGGGCCCA AAACCUGCAU GAAUAUCUGG CAGGGGACCU UUCCCAUCAA 6540
UUGUUACACG GAGGGCCAGU GCGUGCCGAA ACCCGCACCA AACUUUAAGA UCGCCAUCUG 6600


GAGGGUGGCG GCCUCAGAGU ACGCGGAGGU GACGCAGCAC GGGUCAUACC ACUACAUAAC 6660
AGGACUUACC ACUGAUAACU UGAAAGUUCC UUGCCAACUA CCUUCUCCAG AGUUCUUUUC 6720
CUGGGUGGAC GGAGUGCAGA UCCAUAGGUU UGCCCCCAUA CCGAAGCCGU UUUUUCGGGA 6780
UGAGGUCUCG UUCUGCGUUG GGCUUAAUUC AUUUGUCGUC GGGUCUCAGC UCCCUUGCGA 6840
UCCUGAACCU GACACAGACG UAUUGACGUC CAUGCUAACA GACCCAUCCC AUAUCACGGC 6900
GGAGACUGCA GCGCGGCGUU UGGCACGGGG GUCACCCCCG UCCGAGGCAA GCUCCUCAGC 6960
GAGCCAGCUA UCGGCACCAU CGCUGCGAGC CACCUGCACC ACCCACGGCA AGGCCUAUGA 7020
UGUGGACAUG GUGGAUGCCA ACCUGUUCAU GGGGGGCGAU GUGACCCGGA UAGAGUCUGA 7080
GUCCAAAGUG GUCGUUCUGG ACUCUCUCGA CCCAAUGGUC GAAGAAAGGA GCGACCUUGA 7140
GCCUUCGAUA CCAUCGGAAU AUAUGCUCCC CAAGAAGAGA UUCCCACCAG CCUUACCGGC 7200
UUGGGCACGG CCUGAUUACA ACCCACCGCU UGUGGAAUCG UGGAAGAGGC CAGAUUACCA 7260
ACCGGCCACU GUUGCGGGCU GCGCUCUCCC CCCCCCUAAG AAAACCCCGA CGCCUCCCCC 7320


AAGGAGACGC CGGACAGUGG GUCUGAGUGA GAGCUCCAUA GCAGAUGCCC UACAACAGCU 7380
GGCCAUCAAG UCCUUUGGCC AGCCCCCCCC AAGCGGCGAU UCAGGCCUUU CCACGGGGGC 7440
GGACGCAGCC GAUUCCGGCA GUCGGACGCC CCCCGAUGAG UUGGCCCUUU CGGAGACAGG 7500
JUCCAUCUCC UCCAUGCCCC CUCUCGAGGG GGAGCCUGGA GAUCCAGACU UGGAGCCUGA 7560
GCAGGUAGAG CUUCAACCUC CCCCCCAGGG GGGGGUGGUA ACCCCCGGCU CAGGCUCGGG 7620
GUCUUGGUCU ACUUGCUCCG AGGAGGACGA CUCCGUCGUG UGCUGCUCCA UGUCAUACUC 7680
CUGGACCGGG GCUCUAAUAA CUCCUUGUAG CCCCGAAGAG GAAAAGUUGC CAAUUGGCCC 7740
CUUGAGCAAC UCCCUGUUGC GAUAUCACAA CAAGGUGUAC UGUACCACAU CAAAGAGCGC 7800
CUCAUUAAGG GCUAAAAAGG UAACUUUUGA UAGGAUGCAA GCGCUCGACG CUCAUUAUGA 7860
CUCAGUCUUG AAQGACAUUA AGCUAGCGGC CUCCAAGGUC ACCGCAAGGC UUCUCACUUU 7920
AGAGGAGGCC UGCCAGUUAA CUCCACCCCA CUCUGCAAGA UCCAAGUAUG GGUUUGGGGC 7980
UAAGGAGGUC CGCAGCUUGU CCGGGAGAGC CGUUAACCAC AUCAAGUCCG UGUGGAAGGA 8040


CCUCCUGGAA GACACACAAA CACCAAUUCC UACAACCAUC AUGGCCAAAA AUGAGGUGUU 8100
CUGCGUGGAC CCCACCAAGG GGGGUAAGAA AGCAGCUCGC CUUAUCGUUU ACCCUGACCU 8160
CGGCGUCAGG GUCUGCGAGA AAAUGGCCCU UUAUGAUAUC ACACAAAAGC UUCCUCAGGC 8220
GGUGAUGGGG GCUUCUUAUG GAUUCCAGUA CUCCCCCGCU CAGCGGGUGG AGUUUCUCUU 8280
GAAGGCAUGG GCGGAAAAGA AAGACCCUAU GGGUUUUUCG UAUGAUACCC GAUGCUUUGA 8340
CUCAACCGUC ACUGAGAGAG ACAUCAGGAC UGAGGAGUCC AUAUAUCGGG CUUGUUCCUU 8400
GCCCGAGGAG GCCCACACUG CCAUACACUC ACUGACUGAG AGACUUUACG UGGGAGGGCC 8460
CAUGUUCAAC AGCAAGGGCC AGACCUGCGG GUACAGGCGU UGCCGCGCCA GCGGGGUGCU 8520
UACCACUAGC AUGGGGAACA CCAUCACAUG CUAUGUGAAA GCCUUAGCGG CCUGUAAGGC 8580
UGCAGGGAUA AUUGCGCCCA CAAUGCUGGU AUGCGGCGAU GACUUGGUUG UCAUCUCAGA 8640
GAGCCAGGGG ACCGAGGAGG ACGAGCGGAA CCUGAGAGCC UUCACGGAGG CUAUGACCAG 8700
GUAUUCUGCC CCUCCUGGUG ACCCCCCCAG ACCGGAAUAU GACCUGGAGC UGAUAACAUC 8760
UUGCUCCUCA AAUGUGUCUG UGGCGUUGGG CCCACAAGGC CGCCGCAGAU ACUACCUGAC 8820
CAGAGACCCU ACCACUCCAA UCGCCCGGGC UGCCUGGGAA ACAGUUAGAC ACUCCCCUGU 8880
CAAUUCAUGG CUAGGAAACA UCAUCCAGUA CGCCCCAACC AUAUGGGCUC GCAUGGUCCU 8940
GAUGACACAC UUCUUCUCCA UUCUCAUGGC CCAAGAUACU CUGGACCAGA ACCUCAACUU 9000


[Nucleotides 9001-9060 are illegible in this copy.]




GUUACACGGG CUUGACGCUU UCUCUCUGCA CACAUACACU CCCCACGAAC UGACACGGGU 9120
GGCUUCAGCC CUCAGAAAAC UUGGGQCGCC ACCCCUCAGA GCGUGGAAGA GCCGGGCACG 9180
UGCAGUCAGG GCGUCCCUCA UCUCCCGUGG GGGGAGAGCG GCCGUUUGCG GCCGAUAUCU 9240
CUUCAACUGG GCGGUGAAGA CCAAGCUCAA ACUCACUCCA UUGCCGGAAG CGCGCCUCCU 9300
GGAUUUAUCC AGCUGGUUCA CUGUCGGCGC CGGCGGGGGC GACAUUUAUC ACAGCGUGUC 9360
GCGUGCCCGA CCCCGCUUAU UACUCCUUGG CCUACUCCUA CUUUUUGUAG GGGUAGGCCU 9420
UUUCCUACUC CCCGCUCGGU AGAGCGGCAC ACAUUAGCUA CACUCCAUAG CUAACUGUCC 9480
CUUUUUUUUU UUUUUUUUUU UUUUUUUUUU UUUUUUUUUU UUUUUUUUUU UUUUUUUUUU 9540


UUUUUUUUUU UUUUUUUUUU UUUUUUUUUU UUUUUUUUUU UUUUUUUUU 9589

Sequence ID No. 2
Sequence Length: 9,589
Sequence Type: nucleic acid
Strandedness: single
Topology: linear
Molecule Type: cDNA to genomic RNA
Method for Determination of Feature: E



ACCCGCCCCT AATAGGGGCG ACACTCCGCC ATGAACCACT CCCCTGTGAG GAACTACTGT 60
CTTCACGCAG AAAGCGTCTA GCCATGGCGT TAGTATGAGT GTCGTACAGC CTCCAGGCCC 120
CCCCCTCCCG GGAGAGCCAT AGTGGTCTGC GGAACCGGTG AGTACACCGG AATTGCCGGG 180
AAGACTGGGT CCTTTCTTGG ATAAACCCAC TCTATGCCCG GTCATTTGGG CGTGCCCCCG 240
CAAGACTGCT AGCCGAGTAG CGTTGGGTTG CGAAAGGCCT TGTGGTACTG CCTGATAGGG 300
TGCTTGCGAG TGCCCCGGGA GGTCTCGTAG ACCGTGCACC ATGAGCACAA ATCCTAAACC 360
TCAAAGAAAA ACCAAAAGAA ACACCAACCG TCGCCCACAA GACGTTAAGT TTCCGGGCGG 420
CGGCCAGATC GTTGGCGGAG TATACTTGTT GCCGCGCAGG GGCCCCAGGT TGGGTGTGCG 480
CGCGACAAGG AAGACTTCGG AGCGGTCCCA GCCACGTGGA AGGCGCCAGC CCATCCCTAA 540
GGATCGGCGC TCCACTGGCA AATCCTGGGG AAAACCAGGA TACCCCTGGC CCCTATACGG 600


GAATGAGGGA CTCGGCTGGG CAGGATGGCT CCTGTCCCCC CGAGGTTCCC GTCCCTCTTG 660
GGGCCCCAAT GACCCCCGGC ATAGGTCCCG CAACGTGGGT AAGGTCATCG ATACCCTAAC 720
GTGCGGCTTT GCCGACCTCA TGGGGTACAT CCCTGTCGTA GGCGCCCCGC TCGGCGGCGT 780
CGCCAGAGCT CTCGCGCATG GCGTGAGAGT CCTGGAGGAC GGGGTTAATT TTGCAACAGG 840
GAACTTACCC GGTTGCTCCT TTTCTATCTT CTTGCTGGCC CTGCTGTCCT GCATCACCAC 900
CCCGGTCTCC GCTGCCGAAG TGAAGAACAT CAGTACCGGC TACATGGTGA CCAACGACTG 960
CACCAATGAT AGCATTACCT GGCAACTCCA GGCTGCTGTC CTCCACGTCC CCGGGTGCGT 1020
CCCGTGCGAG AAAGTGGGGA ATACATCTCG GTGCTGGATA CCGGTCTCAC CGAATGTGGC 1080
CGTGCAGCAG CCCGGCGCCC TCACGCAGGG CTTACGGACG CACATTGACA TGGTTGTGAT 1140
GTCCGCCACG CTCTGCTCCG CTCTTTACGT GGGGGACCTC TGCGGTGGGG TGATGCTTGC 1200
AGCCCAGATG TTCATTGTCT CGCCACAGCA CCACTGGTTT GTGCAAGACT GCAATTGCTC 1260
CATCTACCCT GGTACCATCA CTGGACACCG CATGGCGTGG GACATGATGA TGAACTGGTC 1320
GCCCACGGCT ACCATGATCC TGGCGTACGC GATGCGCGTC CCCGAGGTCA TCATAGACAT 1380
CATTGGCGGG GCTCATTGGG GCGTCATGTT CGGCTTAGCC TACTTCTCTA TGCAGGGAGC 1440
GTGGGCAAAA GTCGTTGTCA TTCTTTTGCT GGCCGCCGGG GTGGACGCGC AAACCCATAC 1500
CGTTGGGGGT TCTACCGCGC ATAACGCCAG GACCCTCACC GGCATGTTCT CCCTTGGTGC 1560
CAGGCAGAAA ATCCAGCTCA TCAACACCAA TGGCAGTTGG CACATCAACC GCACCGCCCT 1620
GAACTGCAAT GACTCTTTGC ACACCGGCTT CCTCGCGTCA CTGTTCTACA CCCACAGCTT 1680
CAACTCGTCA GGATGTCCCG AACGCATGTC CGCCTGCCGC AGTATCGAGG CCTTTCGGGT 1740
GGGATGGGGC GCCTTACAAT ATGAGGACAA TGTCACCAAT CCAGAGGATA TGAGACCGTA 1800


TTGCTGGCAC TACCCACCAA GACAGTGTGG TGTAGTCTCC GCGAGCTCTG TGTGTGGCCC 1860
AGTGTACTGT TTCACCCCCA GCCCAGTAGT AGTGGGTACG ACCGATAGAC TTGGAGCGCC 1920
CACTTACACG TGGGGGGAGA ATGAGACAGA TGTCTTCCTA TTGAACAGCA CTCGACCACC 1980
GCAGGGGTCA TGGTTCGGCT GCACGTGGAT GAACTCCACT GGCTACACCA AGACTTGCGG 2040
CGCACCACCC TGCCGCATTA GAGCTGACTT CAATGCCAGC ATGGACTTGT TGTGCCCCAC 2100
GGACTGTTTT AGGAAGCATC CTGATACCAC CTACATCAAA TGTGGCTCTG GGCCCTGGCT 2160
CACGCCAAGG TGCCTGATCG ACTACCCCTA CAGGCTCTGG CATTACCCCT GCACAGTTAA 2220
CTATACCATC TTCAAAATAA GGATGTATGT GGGGGGGGTC GAGCACAGGC TCACGGCTGC 2280
GTGCAATTTC ACTCGTGGGG ATCGTTGCAA CTTGGAGGAC AGAGACAGAA GTCAACTGTC 2340
TCCTTTGCTG CACTCCACCA CGGAGTGGGC CATTTTACCT TGCACTTACT CGGACCTGCC 2400
CGCCTTGTCG ACTGGTCTTC TCCACCTCCA CCAAAACATC GTGGACGTGC AATTCATGTA 2460


TGGCCTATCA CCTGCTCTCA CAAAATACAT CGTCCGATGG GAGTGGGTAG TACTCTTATT 2520




CCTGCTCTTA GCGGACGCCA GGGTTTGCGC CTGCTTATGG ATGCTCATCT TGTTGGGCCA 2580
GGCCGAAGCA GCACTAGAGA AGTTGGTCGT CTTGCACGCT GCGAGCGCAG CTAGCTGCAA 2640
TGGCTTCCTA TACTTTGTCA TCTTTTTCGT GGCTGCTTGG TACATCAAGG GTCGGGTAGT 2700
CCCCTTGGCT ACTTATTCCC TCACTGGCCT ATGGTCCTTT GGCCTACTGC TCCTAGCATT 2760
GCCCCAACAG GCTTATGCTT ATGACGCATC TGTACATGGT CAGATAGGAG CAGCTCTGTT 2820
GGTACTGATC ACTCTCTTTA CACTCACCCC CGGGTATAAG ACCCTTCTCA GCCGGTTTCT 2880
GTGGTGGTTG TGCTATCTTC TGACCCTGGC GGAAGCTATG GTCCAGGAGT GGGCACCACC 2940
TATGCAGGTG CGCGGTGGCC GTGATGGGAT CATATGGGCC GTCGCCATAT TCTGCCCGGG 3000
TGTGGTGTTT GACATAACCA AGTGGCTCTT GGCGGTGCTT GGGCCTGCTT ATCTCCTAAA 3060
AGGTGCTTTG ACGCGTGTGC CGTACTTCGT CAGGGCTCAC GCTCTACTAA GGATGTGCAC 3120
CATGGTAAGG CATCTCGCGG GGGGTAGGTA CGTCCAGATG GTGCTACTAG CCCTTGGCAG 3180
GTGGACTGGC ACTTACATCT ATGACCACCT CACCCCTATG TCGGATTGGG CTGCTAATGG 3240


OOTGCGGGAC 


TTGGCGGTCG 

1 1 VlVIVfUvi 1 V/U 


CC6TGGAGCC 


TATCATCTTC 


A6TCCGATGG 


AGAAAAAAGT 


3300 


CATCGTCTGG 


G6AGCGGAGA 


CAGCTGCTTG 


CGGGGATATC 

x/uuuun 1 n 1 V 


TTACACGGAC 


TTCCCGTGTC 


3360 


cGrrrGAOTT 

V/\iV/v\/Vlov i 1 


G6CCG6GAG6 

vJ u V/ v u u u n vi VI 


TCCTCCTT6G 


CCCAGCT6AT 

x/x/vnuv 1 un 1 


G6CTATACCT 


CCAAG66GT6 

vr V f 1 fl t \A \A vt Vf 1 vl 


3420 




GOrcrCATCA 


CTGCTTATGC 


CCAGCAGACA 


CGCGGCCTTT 

V/UVUUV/V 1 1 1 


T6GGCACCAT 

1 uuuvrnv/v/n i 


3480 

U*tV> u 


AGTfifiTfiAfir 

nU 1 VJU 1 UnUV/ 


ATGAOGGGGC 


GOGACAAGAC 


AGAACAGGCC 


GGGGAGATTC 

UUUUriUrv 1 1 Vr 


AGGTCCTGTC 

nuu 1 vr v/ i u i v 


3540 


trMV/UU 1 V/nV/ f 




TOGGAAPAAr 


rATOTOGGGG 

vf\ 1 V 1 vUUUU 


GTCTTATGGA 

u 1 Vr i 1 n [ uun 


CTGTCTACCA 

V/ 1 U 1 Vr 1 nV/V/ri 


3600 


1 UUMVJl/ 1 UUV/ 


AAPAAGAPTO 


TAGCCGGCTC 


ACGGGGTCCG 

nv/uuuu 1 \/V/U 


GTCACACAGA 


TGTACTCCAG 

1 u • nvr 1 V/Vrnu 


3660 

vFVrV/U 




GAOTTAGTGG 


GGTGGOrOAG 


rrcrrorGGG 

V/V/ V V/ V/V/V/UUU 


ACCAAATCTT 

r\\rV/orini Vrl 1 


TGGAGCCGTG 

1 uunuvrv/u 1 vl 


3720 


rAOGTGTGGA GCGGTCGACC TATACCTGGT CACGCGAAAC GCTGATGTCA TCCCGGCTCG 3780
AAGAfGrGGG GACAAGCGAG GAGCGCTACT CTCCCCGAGA CCTCTTTCCA CCTTGAAGGG 3840
GTmCGGGG GGrCOGGTGr TCTGrrrrAG AGGrrAPGrT GTCGGGGTCT TCCGGGCAGC 3900
OGTGTGrTOr rGGGGCGTGG OCAAGTrOAT AGATTTTATG CCCGTTGAGA CACTTGACAT 3960
CGTCACTCGG TCCCCCACCT TTAGTGACAA CAGCACACCA CCTGCTGTGC CCCAAACTTA 4020
TOAGGTCGGG TACTTACATG CCCCGACTGG TAGTGGAAAG AGCACCAAAG TCCCTGTCGC 4080
GTATGCCGCT CAGGGGTACA AAGTGCTAGT GCTTAATCCC TCGGTGGCTG CCACCCTGGG 4140
GTTTGGGGCG TACTTGTCCA AGGCACATGG CATCAATCCC AACATTAGGA CTGGGGTCAG 4200


bAO 1 U i (lAl/U 


AOUUtibbl/tiO 


(/(/AllrACGI A 


(/lOUAOAIAI 


A/*/* A A A TTA/^ 

UbOAAAl 100 


1 OuOOuA 1 uu 


4^:00 


GGGCTGCGCA GGCGGCGCCT ATGACATCAT CATATGCGAT GAATGCCATG CCGTGGACTC 4320
TACCACCATT CTCGGCATCG GAACAGTCCT CGATCAAGCA GAGACAGCCG GGGTCAGGCT 4380
AACTGTACTG GCTACGGCTA CGCCCCCCGG GTCAGTGACA ACCCCCCACC CCAACATAGA 4440
GGAGGTGGCC CTCGGGCAGG AGGGTGAGAT CCCCTTCTAT GGGAGGGCGA TTCCCGTGTC 4500
ATACATCAAG GGAGGAAGAC ACTTGATCTT CTGCCACTCA AAGAAAAAGT GTGACGAGCT 4560
CGCGGCGGCC CTTCGGGGTA TGGGCTTGAA CGCAGTGGCA TACTACAGAG GGCTGGACGT 4620
CTCCGTAATA CCAACTCAGG GAGACGTAGT GGTCGTCGCC ACCGACGCCC TCATGACGGG 4680


GTTTACTGGA


GACTTTGACT 


CCGTGATC6A 


OTGOAArfiTA 

Vr 1 UV/rVMl/U 1 rt 




AMU 1 1 li 1 AuH 


A 7A(\ 


CTTCAGCTTG 


GACCCCACAT 


TCACCATAAC 

■ VI»W1* i t 111 V 


CACACAGACT 

vn v/r\ vfi urv V i 


GTrOPTOAAG 


nV/ul/ 1 u 1 U 1 1/ 


/I AAA 


ACGTAGCCAG 


CGCC6G6GCC 


GCACGGGCAG 


GGGAAGACTG 


GGTATTTATA 

uvi 1 n 1 1 1 n 1 n 


RR TATRTTTf^ 

UU t M 1 U 1 1 I Km 


40D\/ 


CACTG6TGAG 


CGAGCCTCAG 


GAATGTTTGA 


CAGTGTAGTG 


CTOTGCGAGT 


RriATRATRr 




AGGGGCCGCA 


T6GTATGAGC 


TCACACCAGC 


GGAGACCACC 


GTCAGGCTCA 


RARPATATTT 




CAACACACCT 


6GTTT6CCT6 


TGTGCCAAGA 


CCATCTTGAG 


TTTTGGGAGG 

1 1 1 1 uuunuu 


CAGTTTTrAr 




CG6CCTCACA 


CACATAGATG 


CCCACTTCCT 


TTCCCAAACA 


AAGCAATCGG 


GGGAAAATTT 

uuunnnn ill 




CGCATACTTA 


ACAGCCTACC 


AGGCTACAGT 


GTGCGCTAGG 


GCCAAA6CCC 


vvV/vvvV/U 1 V 


S1fiA 


CTGGGACGTC 


ATGT66AAGT 


6TTTGACTC6 


ACTCAAGCCC 


ACACTCGTGG 

nvflVr 1 vU 1 UU 


UV/V/V/vnl/nvV/ 




TCTCCTGTAC 


CGCTTGGGCT 


CTGTTACCAA 


CGAGGTCACC 




TTRTRArRAA 


fi9AA 


ATACATCGCC 


ACCTGCATGC 


AAGCCGACCT 


TGAGGTCATG 


ArCAGOArGT 


RRRTfTTARr 

uuu 1 V/ 1 1 nUV/ 


oO**v 


TG66GGG6TC 


TTGGC6GCCG 


TCGCCGCGTA 


rTfirriGGPG 

\f 1 UUV/ t UUV/U 


n\rV/UUU lulu 


TTTRPATfAT 




CGGCCGCTTG 


CAC6TTAACC 


AGCGAGCCGT 


OGTTfirArrfi 


GArAAGfiAGR 


TmrTATRA 


^4AA 


GGCTTTTGAT 


GAGAT6GAGG 


AATGTGCCTC 


TARARrfifirT 


rTCATTRAAR 


AftHRRrARrR 




GATAGCCGAG 


ATGCTGAA6T 


CCAAfiATCOA 


AGGrTTATTR 


TARrAARriT 




OOoU 


TCAAGACATA 


CAACCCGCTG 


TGCAGGCnr 


1 1 vJUV/vV/Mnvi 


RTARARPAAT 


1 V/ 1 uuu wHM 


0D4V/ 


ACACAT6TGG 


AACTTCATCA 


GCGGCATTCA 


ATAmrRPA 


RRATTATrAA 


TAf^TRrrARR 

1 Ulf l/HUU 


0 f V/v 


GAACCCTGCT 


GTAGCTTCCA 


TGATGGCATT 






rRTTRTPAAP 

\/U 1 1 u 1 OnHlr 


^7fiA 


TAGCACCACT 


ATCCTTCTCA 


ACATTTTGGG 


GGGPTfifiPTA 


RpATrrrAAA 


TTRrRrrTRP 

1 1 UUUV/V/ 1 


^A9A 


CGCGGGGGCT 


ACC6GCTTCG 


TCGTCAGTGG 


rOTftfiTRfiftft 

v\/ 1 UU 1 uuuu 


RriftrrRTAR 

M\f I UlrV/U 1 nU 


RrARPATARR 

Ul/HuV/M 1 nuU 


OooU 


CrrGGGTAAG 


GTGCTG6TGG 


ACATCCTGGC 


AGGGTATGGT 


GrGGRrATTT 

UV/UUUVrn 1 1 1 


RRRRRRRTRT 




CGTOGrATTr 

vu 1 \/U\/n 1 1 \f 






UV/UV/ 1 ULA 1 U 


b AuUA 1 u 1 1 u 


1 LAAtU 1 (lO 1 


C AAA 

6000 


GCCTGGAATT CTGTCTCCGG GTGCCCTGGT GGTGGGAGTC ATCTGCGCGG CCATCCTGCG 6060
CCGACACGTG GGACCGGGGG AAGGCGCTGT CCAATGGATG AATAGGCTCA TTGCCTTTGC 6120
TTCCAGAGGA AACCACGTCG CCCCCACCCA CTACGTGACG GAGTCGGATG CGTCGCAGCG 6180
TGTGACCCAA CTACTTGGCT CCCTTACCAT AACCAGCCTG CTCAGGAGAC TCCACAACTG 6240
GATTACTGAA GACTGCCCCA TCCCATGCAG CGGCTCGTGG CTCCGCGATG TGTGGGATTG 6300
GGTTTGCACC ATCCTAACAG ACTTTAAAAA CTGGCTGACC TCCAAATTGT TCCCAAAGAT 6360


GCCTGGTCTC CCCTTTATCT CTTGTCAAAA GGGGTACAAG GGCGTGTGGG CTGGCACTGG 6420
TATCATGACC ACACGGTGTC CTTGCGGCGC CAATATCTCT GGCAATGTCC GCCTGGGCTC 6480
CATGAGAATT ACGGGGCCCA AAACCTGCAT GAATATCTGG CAGGGGACCT TTCCCATCAA 6540
TTGTTACACG GAGGGCCAGT GCGTGCCGAA ACCCGCACCA AACTTTAAGA TCGCCATCTG 6600
GAGGGTGGCG GCCTCAGAGT ACGCGGAGGT GACGCAGCAC GGGTCATACC ACTACATAAC 6660
AGGACTTACC ACTGATAACT TGAAAGTTCC TTGCCAACTA CCTTCTCCAG AGTTCTTTTC 6720
CTGGGTGGAC GGAGTGCAGA TCCATAGGTT TGCCCCCATA CCGAAGCCGT TTTTTCGGGA 6780
TGAGGTCTCG TTCTGCGTTG GGCTTAATTC ATTTGTCGTC GGGTCTCAGC TCCCTTGGGA 6840
TCCTGAACCT GACACAGACG TATTGACGTC CATGCTAACA GACCCATCCC ATATCACGGC 6900
GGAGACTGCA GCGCGGCGTT TGGCACGGGG GTCACCCCCG TCCGAGGCAA GCTCCTCAGC 6960
GAGCCAGCTA TCGGCACCAT CGCTGCGAGC CACCTGCACC ACCCACGGCA AGGCCTATGA 7020
TGTGGACATG GTGGATGCCA ACCTGTTCAT GGGGGGCGAT GTGACCCGGA TAGAGTCTGA 7080
GTCCAAAGTG GTCGTTCTGG ACTCTCTCGA CCCAATGGTC GAAGAAAGGA GCGACCTTGA 7140
GCCTTCGATA CCATCGGAAT ATATGCTCCC CAAGAAGAGA TTCCCACCAG CCTTACCGGC 7200
TTGGGCACGG CCTGATTACA ACCCACCGCT TGTGGAATCG TGGAAGAGGC CAGATTACCA 7260
ACCGGCCACT GTTGCGGGCT GCGCTCTCCC CCCCCCTAAG AAAACCCCGA CGCCTCCCCC 7320
AAGGAGACGC CGGACAGTGG GTCTGAGTGA GAGCTCCATA GCAGATGCCC TACAACAGCT 7380
GGCCATCAAG TCCTTTGGCC AGCCCCCCCC AAGCGGCGAT TCAGGCCTTT CCACGGGGGC 7440
GGACGCAGCC GATTCCGGCA GTCGGACGCC CCCCGATGAG TTGGCCCTTT CGGAGACAGG 7500
TTCCATCTCC TCCATGCCCC CTCTCGAGGG GGAGCCTGGA GATCCAGACT TGGAGCCTGA 7560
GCAGGTAGAG CTTCAACCTC CCCCCCAGGG GGGGGTGGTA ACCCCCGGCT CAGGCTCGGG 7620
GrCTTGGTCT ACTTGCTCCG AGGAGGACGA CTCCGTCGTG TGCTGCTCCA TGTCATACTC 7680
CTGGACCGGG GCTCTAATAA CTCCTTGTAG CCCCGAAGAG GAAAAGTTGC CAATTGGCCC 7740
CTTGAGCAAC TCCCTGTTGC GATATCACAA CAAGGTGTAC TGTACCACAT CAAAGAGOGC 7800
CTCATTAAGG GCTAAAAAGG TAACTTTTGA TAGGATGCAA GCGCTCGACG CTCATTATGA 7860
CTCAGTCTTG AAGGACATTA AGCTAGCGGC CTCCAAGGTC ACCGCAAGGC TTCTCACTTT 7920
AGAGGAGGCC TGCCAGTTAA CTCCACCCCA CTCTGCAAGA TCCAAGTATG GGTTTGGGGC 7980
TAAGGAGGTC CGCAGCTTGT CCGGGAGAGC CGTTAACCAC ATCAAGTCCG TGTGGAAGGA 8040
CCTCCTGGAA GACACACAAA CACCAATTCC TACAACCATC ATGGCCAAAA ATGAGGTGTT 8100
CTGCGTGGAC CCGACCAAGG GGGGTAAGAA AGCAGCTCGC CTTATCGTTT ACCCTGACCT 8160
CGGCGTCAGG GTCTGCGAGA AAATGGCCCT TTATGATATC ACACAAAAGC TTCCTCAGGC 8220
GGTGATGGGG GCTTCTTATG GATTCCAGTA CTCCCCCGCT CAGCGGGTGG AGTTTCTCTT 8280
GAAGGCATGG GCGGAAAAGA AAGACCCTAT GGGTTTTTCG TATGATACCC GATGCTTTGA 8340
CTCAACCGTC ACTGAGAGAG ACATCAGGAC TGAGGAGTCC ATATATCGGG CTTGTTCCTT 8400
GCCCGAGGAG GCCCACACTG CCATACACTC ACTGACTGAG AGACTTTACG TGGGAGGGCC 8460
CATGTTCAAC AGCAAGGGCC AGACCTGCGG GTACAGGCGT TGCCGCGCCA GCGGGGTGCT 8520
TACCACTAGC ATGGGGAACA CCATCACATG CTATGTGAAA GCCTTAGCGG CCTGTAAGGC 8580
TGCAGGGATA ATTGCGCCCA CAATGCTGGT ATGCGGCGAT GACTTGGTTG TCATCTCAGA 8640
GAGCCAGGGG ACCGAGGAGG ACGAGCGGAA CCTGAGAGCC TTCACGGAGG CTATGACCAG 8700
GTATTCTGCC CCTCCTGGTG ACCCCCCCAG ACCGGAATAT GACCTGGAGC TGATAACATC 8760
TTGCTCCTCA AATGTGTCTG TGGCGTTGGG CCCACAAGGC CGCCGCAGAT ACTACCTGAC 8820
CAGAGACCCT ACCACTCCAA TCGCCCGGGC TGCCTGGGAA ACAGTTAGAC ACTCCCCTGT 8880
CAATTCATGG CTAGGAAACA TCATCCAGTA CGCCCCAACC ATATGGGCTC GCATGGTCCT 8940
GATGACACAC TTCTTCTCCA TTCTCATGGC CCAAGATACT CTGGACCAGA ACCTCAACTT 9000
TGAGATGTAC GGAGCGGTGT ACTCCGTGAG TCCCTTGGAC CTCCCAGCCA TAATTGAAAG 9060
GTTACACGGG CJTGACGCTT TCTCTCTGCA CACATACACT CCCCACGAAC TGACACGGGT 9120
GGCTTCAGCC CTCAGAAAAC TTGGGGCGCC ACCCCTCAGA GCGTGGAAGA GCCGGGCACG 9180
TGCAGTCAGG GCGTCCCTCA TCTCCCGTGG GGGGAGAGCG GCCGTTTGCG GCCGATATCT 9240
CTTCAACTGG GCGGTGAAGA CCAAGCTCAA ACTCACTCCA TTGCCGGAAG CGCGCCTCCT 9300
GGATTTATCC AGCTGGTTCA CTGTCGGCGC CGGCGGGGGC GACATTTATC ACAGCGTGTC 9360
GCGTGCCCGA CCCCGCTTAT TACTCCTTGG CCTACTCCTA CTTTTTGTAG GGGTAGGCCT 9420
TTTCCTACTC CCCGCTCGGT AGAGCGGCAC ACATTAGCTA CACTCCATAG CTAACTGTCC 9480
CTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT 9540
TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTT 9589

Sequence ID No.3
Sequence Length: 3,970
Sequence Type: nucleic acid
Strandedness: single
Topology: linear

Molecule Type: cDNA to genomic RNA
Method for Determination of Feature: E




GGCATTACCC CTGCACAGTT AACTATACCA TCTTCAAAAI AAIiuAKjIAI GTGGGGGGGG 60
TCGAGCACAG GCTCACGGCT GCGTGCAATT TCACTCGTGG GGAIOGI IGC AACTTGGAGG 120
ACAGAGACAG AAGTCAACTG TCTCCTTTGC TGCACFCCAC CAOGbAO 1 lib GCCATTTTAC 180


CTTGCACTTA CTCGGACCTG CCCGCCTTGT CGACTGGTCT TCTCCACCTC CACCAAAACA 240
TCGTGGACGT GCAATTCATG TATGGCCTAT CACCTGCTCT CACAAAATAC ATCGTCCGAT 300
GGGAGTGGGT AGTACTCTTA TTCCTGCTCT TAGCGGACGC CAGGGTTTGC GCCTGCTTAT 360
GGATGCTCAT CTTGTTGGGC CAGGCCGAAG CAGCACTAGA GAAGTTGGTC GTCTTGCACG 420
CTGCGAGCGC AGCTAGCTGC AATGGCTTCC TATACTTTGT CATCTTTTTC GTGGCTGCTT 480
GGTACATCAA GGGTCGGGTA GTCCCCTTGG CTACTTATTC CCTCACTGGC CTATGGTCCT 540
TTGGCCTACT GCTCCTAGCA TTGCCCCAAC AGGCTTATGC TTATGACGCA TCTGTACATG 600
GTCAGATAGG AGCAGCTCTG TTGGTACTGA TCACTCTCTT TACACTCACC CCCGGGTATA 660
AGACCCTTCT CAGCCGGTTT CTGTGGTGGT TGTGCTATCT TCTGACCCTG GCGGAAGCTA 720
TGGTCCAGGA GTGGGCACCA CCTATGCAGG TGCGCGGTGG CCGTGATGGG ATCATATGGG 780
CCGTCGCCAT ATTCTGCCCG GGTGTGGTGT TTGACATAAC CAAGTGGCTC TTGGCGGTGC 840
TTGGGCCTGC TTATCTCCTA AAAGGTGCTT TGACGCGTGT GCCGTACTTC GTCAGGGCTC 900
ACGCTCTACT AAGGATGTGC ACCATGGTAA GGCATCTCGC GGGGGGTAGG TACGTCCAGA 960
TGGTGCTACT AGCCCTTGGC AGGTGGACTG GCACTTACAT CTATGACCAC CTCACCCCTA 1020
TGTCGGATTG GGCTGCTAAT GGCCTGCGGG ACTTGGCGGT CGCCGTGGAG CCTATCATCT 1080
TCAGTCCGAT GGAGAAAAAA GTCATCGTCT GGGGAGCGGA GACAGCTGCT TGCGGGGATA 1140
TCTTACACGG ACTTCCCGTG TCCGCCCGAC TTGGCCGGGA GGTCCTCCTT GGCCCAGCTG 1200
ATGGCTATAC CTCCAAGGGG TGGAGTCTTC rCGCCCCCAT CACTGCTTAT GCCCAGCAGA 1260


CACGCGGCCT TTTGGGCACC ATAGTGGTGA GCATGACGGG GCGCGACAAG ACAGAACAGG 1320
CCGGGGAGAT TCAGGTCCTG TCCACGGTCA CTCAGTCCTT CCTCGGAACA ACCATCTCGG 1380
GGGTCTTATG GACTGTCTAC CATGGAGCTG GCAACAAGAC TCTAGCCGGC TCACGGGGTC 1440
CGGTCACACA GATGTACTCC AGTGCTGAGG GGGACTTAGT GGGGTGGCCC AGCCCCCCCG 1500
GGACCAAATC TTTGGAGCCG TGCACGTGTG GAGCGGTCGA CCTATACCTG GTCACGCGAA 1560
ACGCTGATGT CATCCCGGCT CGAAGACGCG GGGACAAGCG AGGAGCGCTA CTCTCCCCGA 1620
GACCTCTTTC CACCTTGAAG GGGTCCTCGG GGGGCCCGGT GCTCTGCCCC AGAGGCCACG 1680
CTGTCGGGGT CTTCCGGGCA GCCGTGTGCT CCCGGGGCGT GGCCAAGTCC ATAGATTTTA 1740
TCCCCGTTGA GACACTTGAC ATCGTCACTC GGTCCCCCAC CTTTAGTGAC AACAGCACAC 1800
CACCTGCTGT GCCCCAAACT TATCAGGTCG GGTACTTACA TGCCCCGACT GGTAGTGGAA 1860
AGAGCACCAA AGTCCCTGTC GCGTATGCCG CTCAGGGGTA CAAAGTGCTA GTGCTTAATC 1920
CCTCGGTGGC TGCCACCCTG GGGTTTGGGG CGTACTTGTC CAAGGCACAT GGCATCAATC 1980
CCAACATTAG GACTGGGGTC AGGACTGTGA CGACCGGGGC GCCCATCACG TACTCCACAT 2040
ATGGCAAATT CCTCGCCGAT GGGGGCTGCG CAGGCGGCGC CTATGACATC ATCATATGCG 2100
ATGAATGCCA TGCCGTGGAC TCTACCACCA TTCTCGGCAT CGGAACAGTC CTCGATCAAG 2160
CAGAGACAGC CGGGGTCAGG CTAACTGTAC TGGCTACGGC TACGCCCCCC GGGTCAGTGA 2220
CAACCCCCCA CCCCAACATA GAGGAGGTGG CCCTCGGGCA GGAGGGTGAG ATCCCCTTCT 2280
ATGGGAGGGC GATTCCCCTG TCATACATCA AGGGAGGAAG ACACTTGATC TTCTGCCACT 2340
CAAAGAAAAA GTGTGACGAG CTCGCGGCGG CCCTTCGGGG TATGGGCTTG AACGCAGTGG 2400
CATACTACAG AGGGCTGGAC GTCTCCGTAA TACCAACTCA GGGAGACGTA GTGGTCGTCG 2460
CCACCGACGC CCTCATGACG GGGTTTACTG GAGACTTTGA CTCCGTGATC GACTGCAACG 2520
TAGCGGTCAC TCAAGTTGTA GACTTCAGCT TGGACCCCAC ATTCACCATA ACCACACAGA 2580
CTGTCCCTCA AGACGCTGTC TCACGTAGCC AGCGCCGGGG CCGCACGGGC AGGGGAAGAC 2640
TGGGTATTTA TAGGTATGTT TCCACTGGTG AGCGAGCCTC AGGAATGTTT GACAGTGTAG 2700
TGCTCTGCGA GTGCTACGAT GCAGGGGCCG CATGGTATGA GCTCACACCA GCGGAGACCA 2760
CCGTCAGGCT CAGAGCATAT TTCAACACAC CTGGTTTGCC TGTGTGCCAA GACCATCTTG 2820
AGTTTTGGGA GCAGTTTTC ACCGGCCTCA CACACATAGA TGCCCACTTC CTTTCCCAAA 2880
CAAAGCAATC GGGGGAAAAT TTCGCATACT TAACAGCCTA CCAGGCTACA GTGTGCGCTA 2940
GGGCCAAAGC CCCCCCCCCG TCCTGGGACG TCATGTGGAA GTGTTTGACT CGACTCAAGC 3000
CCACACTCGT GGGCCCCACA CCTCTCCTGT ACCGCTTGGG CTCTGTTACC AACGAGGTCA 3060
CCCTCACGCA TCCTGTGACG AAATACATCG CCACCTGCAT GCAAGCCGAC CTTGAGGTCA 3120
TGACCAGCAC GTGGGTCTTA GCTGGGGGGG TCTTGGCGGC CGTCGCCGCG TACTGCCTGG 3180
CGACCGGGTG TGTTTGCATC ATCGGCCGCT TGCACGTTAA CCAGCGAGCC GTCGTTGCAC 3240
CGGACAAGGA GGTCCTCTAT GAGGCTTTTG ATGAGATGGA GGAATGTGCC TCTAGAGCGG 3300
CTCTCATTGA AGAGGGGCAG CGGATAGCCG AGATGCTGAA GTCCAAGATC CAAGGCTTAT 3360
TGCAGCAAGC TTCCAAACAA GCTCAAGACA TACAACCCGC TGTGCAGGCT TCTTGGCCCA 3420
AGGTAGAGCA ATTCTGGGCC AAACACATGT GGAACTTCAT CAGCGGCATT CAATACCTCG 3480
CAGGACTATC AACACTGCCA GGGAACCCTG CTGTAGCTTC CATGATGGCA TTCAGTGCCG 3540
CCCTCACCAG TCCGTTGTCA ACTAGCACCA CTATCCTTCT CAACATTTTG GGGGGCTGGC 3600
TAGCATCCCA AATTGCGCCT CCCGCGGGGG CTACCGGCTT CGTCGTCAGT GGCCTGGTGG 3660
GGGCTGCCGT AGGCAGCATA GGCTTGGGTA AGGTGCTGGT GGACATCCTG GCAGGGTATG 3720
GTGCGGGCAT TTCGGGGGCT CTCGTCGCAT TCAAGATCAT GTCTGGCGAG AAGCCCTCCA 3780
TGGAGGATGT TGTCAACCTG CTGCCTGGAA TTCTGTCTCC GGGTGCCCTG GTGGTGGGAG 3840
TCATCTGCGC GGCCATCCTG CGCCGACACG TGGGACCGGG GGAAGGCGCT GTCCAATGGA 3900
TGAATAGGCT CATTGCCTTT GCTTCCAGAG GAAACCACGT CGCCCCCACC CACTACGTGA 3960
CGGAGTCGGA 3970

Sequence ID No.4
Sequence Length: 2,693
Sequence Type: nucleic acid
Strandedness: single
Topology: linear

Molecule Type: cDNA to genomic RNA
Method for Determination of Feature: E



ATTCTGTCTC CGGGTGCCCT GGTGGTGGGA GTCATCTGCG CGGCCATCCT GCGCCGACAC 60
GTGGGACCGG GGGAAGGCGC TGTCCAATGG ATGAATAGGC TCATTGCCTT TGCTTCCAGA 120
GGAAACCACG TCGCCCCCAC CCACTACGTG ACGGAGTCGG ATGCGTCGCA GCGTGTGACC 180
CAACTACTTG GCTCCCTTAC CATAACCAGC CTGCTCAGGA GACTCCACAA CTGGATTACT 240
GAAGACTGCC CCATCCCATG CAGCGGCTCG TGGCTCCGCG ATGTGTGGGA TTGGGTTTGC 300
ACCATCCTAA CAGACTTTAA AAACTGGCTG ACCTCCAAAT TGTTCCCAAA GATGCCTGGT 360
CTCCCCTTTA TCTCTTGTCA AAAGGGGTAC AAGGGCGTGT GGGCTGGCAC TGGTATCATG 420
ACCACACGGT GTCCTTGCGG CGCCAATATC TCTGGCAATG TCCGCCTGGG CTCCATGAGA 480
ATTACGGGGC CCAAAACCTG CATGAATATC TGGCAGGGGA CCTTTCCCAT CAATTGTTAC 540
ACGGAGGGCC AGTGCGTGCC GAAACCCGCA CCAAACTTTA AGATCGCCAT CTGGAGGGTG 600
GCGGCCTCAG AGTACGCGGA GGTGACGCAG CACGGGTCAT ACCACTACAT AACAGGACTT 660
ACCACTGATA ACTTGAAAGT TCCTTGCCAA CTACCTTCTC CAGAGTTCTT TTCCTGGGTG 720
GACGGAGTGC AGATCCATAG GTTTGCCCCC ATACCGAAGC CGTTTTTTCG GGATGAGGTC 780
TCGTTCTGCG TTGGGCTTAA TTCATTTGTC GTCGGGTCTC AGCTCCCTTG CGATCCTGAA 840
CCTGACACAG ACGTATTGAC GTCCATGCTA ACAGACCCAT CCCATATCAC GGCGGAGACT 900
GCAGCGCGGC GTTTGGCACG GGGGTCACCC CCGTCCGAGG CAAGCTCCTC AGCGAGCCAG 960
CTATCGGCAC CATCGCTGCG AGCCACCTGC ACCACCCACG GCAAGGCCTA TGATGTGGAC 1020
ATGGTGGATG CCAACCTGTT CATGGGGGGC GATGTGACCC GGATAGAGTC TGAGTCCAAA 1080
GTGGTCGTTC TGGACTCTCT CGACCCAATG GTCGAAGAAA GGAGCGACCT TGAGCCTTCG 1140
ATACCATCGG AATATATGCT CCCCAAGAAG AGATTCCCAC CAGCCTTACC GGCTTGGGCA 1200
CGGCCTGATT ACAACCCACC GCTTGTGGAA TCGTGGAAGA GGCCAGATTA CCAACCGGCC 1260
ACTGTTGCGG GCTGCGCTCT CCCCCCCCCT AAGAAAACCC CGACGCCTCC CCCAAGGAGA 1320
CGCCGGACAG TGGGTCTGAG TGAGAGCTCC ATAGCAGATG CCCTACAACA GCTGGCCATC 1380
AAGTCCTTTG GCCAGCCCCC CCCAAGCGGC GATTCAGGCC TTTCCACGGG GGCGGACGCA 1440
GCCGATTCCG GCAGTCGGAC GCCCCCCGAT GAGTTGGCCC TTTCGGAGAC AGGTTCCATC 1500
TCCTCCATGC CCCCTCTCGA GGGGGAGCCT GGAGATCCAG ACTTGGAGCC TGAGCAGGTA 1560
GAGCTTCAAC CTCCCCCCCA GGGGGGGGTG GTAACCCCCG GCTCAGGCTC GGGGTCTTGG 1620
TCTACTTGCT CCGAGGAGGA CGACTCCGTC GTGTGCTGCT CCATGTCATA CTCCTGGACC 1680
GGGGCTCTAA TAACTCCTTG TAGCCCCGAA GAGGAAAAGT TGCCAATTGG CCCCTTGAGC 1740
AACTCCCTGT TGCGATATCA CAACAAGGTG TACTGTACCA CATCAAAGAG CGCCTCATTA 1800
AGGGCTAAAA AGGTAACTTT TGATAGGATG CAAGCGCTCG ACGCTCATTA TGACTCAGTC 1860
TTGAAGGACA TTAAGCTAGC GGCCTCCAAG GTCACCGCAA GGCTTCTCAC TTTAGAGGAG 1920
GCCTGCCAGT TAACTCCACC CCACTCTGCA AGATCCAAGT ATGGGTTTGG GGCTAAGGAG 1980
GTCCGCAGCT TGTCCGGGAG AGCCGTTAAC CACATCAAGT CCGTGTGGAA GGACCTCCTG 2040
GAAGACACAC AAACACCAAT TCCTACAACC ATCATGGCCA AAAATGAGGT GTTCTGCGTG 2100
GACCCCACCA AGGGGGGTAA GAAAGCAGCT CGCCTTATCG TTTACCCTGA CCTCGGCGTC 2160
AGGGTCTGCG AGAAAATGGC CCTTTATGAT ATCACACAAA AGCTTCCTCA GGCGGTGATG 2220
GGGGCTTCTT ATGGATTCCA GTACTCCCCC GCTCAGCGGG TGGAGTTTCT CTTGAAGGCA 2280
TGGGCGGAAA AGAAAGACCC TATGGGTTTT TCGTATGATA CCCGATGCTT TGACTCAACC 2340
HTCACTGAGA GAGACATCAG GACTGAGGAG TCCATATATC GGGCTTGTTC CTTGCCCGAG 2400
GAGGCCCACA CTGCCATACA CTCACTGACT GAGAGACTTT ACGTGGGAGG GCCCATGTTC 2460
AACAGCAAGG GCCAGACCTG CGGGTACAGG CGTTGCCGCG CCAGCGGGGT GCTTACCACT 2520
AGCATGGGGA ACACCATCAC ATGCTATGTG AAAGCCTTAG CGGCCTGTAA GGCTGCAGGG 2580
ATAATTGCGC CCACAATGCT GGTATGCGGC GATGACTTGG TTGTCATCTC AGAGAGCCAG 2640
GGGACCGAGG AGGACGAGCG GAACCTGAGA GCCTTCACGG AGGCTATGAC CAG 2693

Sequence ID No.5
Sequence Length: 3,033
Sequence Type: amino acid
Topology: linear
Molecule Type: protein



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr

5 10 15

Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile

20 25 30

Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly

35 40 45

Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly

50 55 60

Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser

65 70 75

Trp Gly Lys Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly

80 85 90

Leu Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro

95 100 105

Ser Trp Gly Pro Asn Asp Pro Arg His Arg Ser Arg Asn Val Gly

110 115 120

Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly

125 130 135

Tyr Ile Pro Val Val Gly Ala Pro Leu Gly Gly Val Ala Arg Ala

140 145 150

Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Phe Ala

155 160 165

Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala

170 175 180

Leu Leu Ser Cys Ile Thr Thr Pro Val Ser Ala Ala Glu Val Lys

185 190 195

Asn Ile Ser Thr Gly Tyr Met Val Thr Asn Asp Cys Thr Asn Asp

200 205 210

Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu His Val Pro Gly

215 220 225

Cys Val Pro Cys Glu Lys Val Gly Asn Thr Ser Arg Cys Trp Ile

230 235 240

Pro Val Ser Pro Asn Val Ala Val Gln Gln Pro Gly Ala Leu Thr

245 250 255

Gln Gly Leu Arg Thr His Ile Asp Met Val Val Met Ser Ala Thr

260 265 270

Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Val Met

275 280 285

Leu Ala Ala Gln Met Phe Ile Val Ser Pro Gln His His Trp Phe

290 295 300

Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly Thr Ile Thr Gly

305 310 315

His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Ala

320 325 330

Thr Met Ile Leu Ala Tyr Ala Met Arg Val Pro Glu Val Ile Ile

335 340 345

Asp Ile Ile Gly Gly Ala His Trp Gly Val Met Phe Gly Leu Ala

350 355 360

Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val Val Val Ile Leu

365 370 375

Leu Leu Ala Ala Gly Val Asp Ala Gln Thr His Thr Val Gly Gly

380 385 390

Ser Thr Ala His Asn Ala Arg Thr Leu Thr Gly Met Phe Ser Leu

395 400 405

Gly Ala Arg Gln Lys Ile Gln Leu Ile Asn Thr Asn Gly Ser Trp

410 415 420

His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu His Thr

425 430 435

Gly Phe Leu Ala Ser Leu Phe Tyr Thr His Ser Phe Asn Ser Ser

440 445 450

Gly Cys Pro Glu Arg Met Ser Ala Cys Arg Ser Ile Glu Ala Phe

455 460 465

Arg Val Gly Trp Gly Ala Leu Gln Tyr Glu Asp Asn Val Thr Asn

470 475 480

Pro Glu Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Gln

485 490 495

Cys Gly Val Val Ser Ala Ser Ser Val Cys Gly Pro Val Tyr Cys

500 505 510

Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Leu Gly

515 520 525

Ala Pro Thr Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu

530 535 540

Leu Asn Ser Thr Arg Pro Pro Gln Gly Ser Trp Phe Gly Cys Thr

545 550 555

Trp Met Asn Ser Thr Gly Tyr Thr Lys Thr Cys Gly Ala Pro Pro

560 565 570

Cys Arg Ile Arg Ala Asp Phe Asn Ala Ser Met Asp Leu Leu Cys

575 580 585

Pro Thr Asp Cys Phe Arg Lys His Pro Asp Thr Thr Tyr Ile Lys

590 595 600

Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys Leu Ile Asp Tyr

605 610 615

Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Tyr Thr Ile

620 625 630

Phe Lys Ile Arg Met Tyr Val Gly Gly Val Glu His Arg Leu Thr

635 640 645

Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys Asn Leu Glu Asp

650 655 660

Arg Asp Arg Ser Gln Leu Ser Pro Leu Leu His Ser Thr Thr Glu

665 670 675

Trp Ala Ile Leu Pro Cys Thr Tyr Ser Asp Leu Pro Ala Leu Ser

680 685 690

Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln Phe

695 700 705

Met Tyr Gly Leu Ser Pro Ala Leu Thr Lys Tyr Ile Val Arg Trp

710 715 720

Glu Trp Val Val Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val

725 730 735

Cys Ala Cys Leu Trp Met Leu Ile Leu Leu Gly Gln Ala Glu Ala

740 745 750

Ala Leu Glu Lys Leu Val Val Leu His Ala Ala Ser Ala Ala Ser

755 760 765

Cys Asn Gly Phe Leu Tyr Phe Val Ile Phe Phe Val Ala Ala Trp

770 775 780

Tyr Ile Lys Gly Arg Val Val Pro Leu Ala Thr Tyr Ser Leu Thr

785 790 795

Gly Leu Trp Ser Phe Gly Leu Leu Leu Leu Ala Leu Pro Gln Gln

800 805 810

Ala Tyr Ala Tyr Asp Ala Ser Val His Gly Gln Ile Gly Ala Ala

815 820 825

Leu Leu Val Leu Ile Thr Leu Phe Thr Leu Thr Pro Gly Tyr Lys

830 835 840

Thr Leu Leu Ser Arg Phe Leu Trp Trp Leu Cys Tyr Leu Leu Thr

845 850 855

Leu Ala Glu Ala Met Val Gln Glu Trp Ala Pro Pro Met Gln Val

860 865 870

Arg Gly Gly Arg Asp Gly Ile Ile Trp Ala Val Ala Ile Phe Cys

875 880 885

Pro Gly Val Val Phe Asp Ile Thr Lys Trp Leu Leu Ala Val Leu

890 895 900

Gly Pro Ala Tyr Leu Leu Lys Gly Ala Leu Thr Arg Val Pro Tyr

905 910 915

Phe Val Arg Ala His Ala Leu Leu Arg Met Cys Thr Met Val Arg

920 925 930

His Leu Ala Gly Gly Arg Tyr Val Gln Met Val Leu Leu Ala Leu

935 940 945

Gly Arg Trp Thr Gly Thr Tyr Ile Tyr Asp His Leu Thr Pro Met

950 955 960

Ser Asp Trp Ala Ala Asn Gly Leu Arg Asp Leu Ala Val Ala Val

965 970 975

Glu Pro Ile Ile Phe Ser Pro Met Glu Lys Lys Val Ile Val Trp

980 985 990

Gly Ala Glu Thr Ala Ala Cys Gly Asp Ile Leu His Gly Leu Pro

995 1000 1005

Val Ser Ala Arg Leu Gly Arg Glu Val Leu Leu Gly Pro Ala Asp

1010 1015 1020

Gly Tyr Thr Ser Lys Gly Trp Ser Leu Leu Ala Pro Ile Thr Ala

1025 1030 1035

Tyr Ala Gln Gln Thr Arg Gly Leu Leu Gly Thr Ile Val Val Ser

1040 1045 1050

Met Thr Gly Arg Asp Lys Thr Glu Gln Ala Gly Glu Ile Glu Val

1055 1060 1065

Leu Ser Thr Val Thr Gln Ser Phe Leu Gly Thr Thr Ile Ser Gly

1070 1075 1080

Val Leu Trp Thr Val Tyr His Gly Ala Gly Asn Lys Thr Leu Ala

1085 1090 1095

Gly Ser Arg Gly Pro Val Thr Gln Met Tyr Ser Ser Ala Glu Gly

1100 1105 1110

Asp Leu Val Gly Trp Pro Ser Pro Pro Gly Thr Lys Ser Leu Glu

1115 1120 1125

Pro Cys Thr Cys Gly Ala Val Asp Leu Tyr Leu Val Thr Arg Asn

1130 1135 1140

Ala Asp Val Ile Pro Ala Arg Arg Arg Gly Asp Lys Arg Gly Ala

1145 1150 1155

Leu Leu Ser Pro Arg Pro Leu Ser Thr Leu Lys Gly Ser Ser Gly

1160 1165 1170

Gly Pro Val Leu Cys Pro Arg Gly His Ala Val Gly Val Phe Arg

1175 1180 1185

Ala Ala Val Cys Ser Arg Gly Val Ala Lys Ser Ile Asp Phe Ile

1190 1195 1200

Pro Val Glu Thr Leu Asp Ile Val Thr Arg Ser Pro Thr Phe Ser

1205 1210 1215

Asp Asn Ser Thr Pro Pro Ala Val Pro Gln Thr Tyr Gln Val Gln

1220 1225 1230

Tyr Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro

1235 1240 1245

Val Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro

1250 1255 1260

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Leu Ser Lys Ala

1265 1270 1275

His Gly Ile Asn Pro Asn Ile Arg Thr Gly Val Arg Thr Val Thr

1280 1285 1290

Thr Gly Ala Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala

1295 1300 1305

Asp Gly Gly Cys Ala Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp

1310 1315 1320

Glu Cys His Ala Val Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr

1325 1330 1335

Val Leu Asp Gln Ala Glu Thr Ala Gly Val Arg Leu Thr Val Leu

1340 1345 1350

Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Thr Pro His Pro Asn

1355 1360 1365

Ile Glu Glu Val Ala Leu Gly Gln Glu Gly Glu Ile Pro Phe Tyr

1370 1375 1380

Gly Arg Ala Ile Pro Leu Ser Tyr Ile Lys Gly Gly Arg His Leu

1385 1390 1395

Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Ala

1400 1405 1410

Leu Arg Gly Met Gly Leu Asn Ala Val Ala Tyr Tyr Arg Gly Leu

1415 1420 1425

Asp Val Ser Val Ile Pro Thr Gln Gly Asp Val Val Val Val Ala

1430 1435 1440

Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val

1445 1450 1455

Ile Asp Cys Asn Val Ala Val Thr Gln Val Val Asp Phe Ser Leu

1460 1465 1470

Asp Pro Thr Phe Thr Ile Thr Thr Gln Thr Val Pro Gln Asp Ala

1475 1480 1485

Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Leu

1490 1495 1500

Gly Ile Tyr Arg Tyr Val Ser Thr Gly Glu Arg Ala Ser Gly Met

1505 1510 1515

Phe Asp Ser Val Val Leu Cys Glu Cys Tyr Asp Ala Gly Ala Ala

1520 1525 1530

Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala

1535 1540 1545

Tyr Phe Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu

1550 1555 1560

Phe Trp Glu Ala Val Phe Thr Gly Leu Thr His Ile Asp Ala His

1565 1570 1575

Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Phe Ala Tyr Leu

1580 1585 1590

Thr Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala Pro Pro

1595 1600 1605

Pro Ser Trp Asp Val Met Trp Lys Cys Leu Thr Arg Leu Lys Pro

1610 1615 1620

Trp Leu Val Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ser Val

1625 1630 1635

Thr Asn Glu Val Thr Leu Thr His Pro Val Thr Lys Tyr Ile Ala

1640 1645 1650

Thr Cys Met Gln Ala Asp Leu Glu Val Met Thr Ser Thr Trp Val

1655 1660 1665

Leu Ala Gly Gly Val Leu Ala Ala Val Ala Ala Tyr Cys Leu Ala

1670 1675 1680

Thr Gly Cys Val Cys Ile Ile Gly Arg Leu His Val Asn Gln Arg

1685 1690 1695

Ala Val Val Ala Pro Asp Lys Glu Val Leu Tyr Glu Ala Phe Asp

1700 1705 1710

Glu Met Glu Glu Cys Ala Ser Arg Ala Ala Leu Ile Glu Glu Gly

1715 1720 1725

Gln Arg Ile Ala Glu Met Leu Lys Ser Lys Ile Gln Gly Leu Leu

1730 1735 1740

Gln Gln Ala Ser Lys Gln Ala Gln Asp Ile Gln Pro Ala Val Gln

1745 1750 1755

Ala Ser Trp Pro Lys Val Glu Gln Phe Trp Ala Lys His Met Trp

1760 1765 1770

Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu

1775 1780 1785

Pro Gly Asn Pro Ala Val Ala Ser Met Met Ala Phe Ser Ala Ala

1790 1795 1800

Leu Thr Ser Pro Leu Ser Thr Ser Thr Thr Ile Leu Leu Asn Ile

1805 1810 1815

Leu Gly Gly Trp Leu Ala Ser Gln Ile Ala Pro Pro Ala Gly Ala

1820 1825 1830

Thr Gly Phe Val Val Ser Gly Leu Val Gly Ala Ala Val Gly Ser

1835 1840 1845

Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly

1850 1855 1860

Ala Gly Ile Ser Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly

1865 1870 1875

Glu Lys Pro Ser Met Glu Asp Val Val Asn Leu Leu Pro Gly Ile

1880 1885 1890

Leu Ser Pro Gly Ala Leu Val Val Gly Val Ile Cys Ala Ala Ile

1895 1900 1905

Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met

1910 1915 1920

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ala Pro

1925 1930 1935

Thr His Tyr Val Thr Glu Ser Asp Ala Ser Gln Arg Val Thr Gln

1940 1945 1950

Leu Leu Gly Ser Leu Thr Ile Thr Ser Leu Leu Arg Arg Leu His

1955 1960 1965

Asn Trp Ile Thr Glu Asp Cys Pro Ile Pro Cys Ser Gly Ser Trp

1970 1975 1980

Leu Arg Asp Val Trp Asp Trp Val Cys Thr Ile Leu Thr Asp Phe

1985 1990 1995

Lys Asn Trp Leu Thr Ser Lys Leu Phe Pro Lys Met Pro Gly Leu

2000 2005 2010

Pro Phe Ile Ser Cys Gln Lys Gly Tyr Lys Gly Val Trp Ala Gly

2015 2020 2025

Thr Gly Ile Met Thr Thr Arg Cys Pro Cys Gly Ala Asn Ile Ser

2030 2035 2040

Gly Asn Val Arg Leu Gly Ser Met Arg Ile Thr Gly Pro Lys Thr

2045 2050 2055

Cys Met Asn Ile Trp Gln Gly Thr Phe Pro Ile Asn Cys Tyr Thr

2060 2065 2070

Glu Gly Gln Cys Val Pro Lys Pro Ala Pro Asn Phe Lys Ile Ala

2075 2080 2085

Ile Trp Arg Val Ala Ala Ser Glu Tyr Ala Glu Val Thr Gln His

2090 2095 2100

Gly Ser Tyr His Tyr Ile Thr Gly Leu Thr Thr Asp Asn Leu Lys

2105 2110 2115

Val Pro Cys Gin Leu Pro Ser Pro Glu Phe Phe Ser Trp Val Asp 
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2120 2125 2130 

Gly Val Gln Ile His Arg Phe Ala Pro Ile Pro Lys Pro Phe Phe 

2135 2140 2145 

Arg Asp Glu Val Ser Phe Cys Val Gly Leu Asn Ser Phe Val Val 

2150 2155 2160 

Gly Ser Gln Leu Pro Cys Asp Pro Glu Pro Asp Thr Asp Val Leu 

2165 2170 2175 

Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Thr Ala 

2180 2185 2190 

Ala Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Glu Ala Ser Ser 

2195 2200 2205 

Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Arg Ala Thr Cys Thr 

2210 2215 2220 

Thr His Gly Lys Ala Tyr Asp Val Asp Met Val Asp Ala Asn Leu 

2225 2230 2235 

Phe Met Gly Gly Asp Val Thr Arg Ile Glu Ser Glu Ser Lys Val 

2240 2245 2250 

Val Val Leu Asp Ser Leu Asp Pro Met Val Glu Glu Arg Ser Asp 

2255 2260 2265 

Leu Glu Pro Ser Ile Pro Ser Glu Tyr Met Leu Pro Lys Lys Arg 

2270 2275 2280 

Phe Pro Pro Ala Leu Pro Ala Trp Ala Arg Pro Asp Tyr Asn Pro 

2285 2290 2295 

Pro Leu Val Glu Ser Trp Lys Arg Pro Asp Tyr Gln Pro Ala Thr 

2300 2305 2310 

Val Ala Gly Cys Ala Leu Pro Pro Pro Lys Lys Thr Pro Thr Pro 

2315 2320 2325 

Pro Pro Arg Arg Arg Arg Thr Val Gly Leu Ser Glu Ser Ser Ile 

2330 2335 2340 

Ala Asp Ala Leu Gln Gln Leu Ala Ile Lys Ser Phe Gly Gln Pro 

2345 2350 2355 

Pro Pro Ser Gly Asp Ser Gly Leu Ser Thr Gly Ala Asp Ala Ala 

2360 2365 2370 

Asp Ser Gly Ser Arg Thr Pro Pro Asp Glu Leu Ala Leu Ser Glu 

2375 2380 2385 

Thr Gly Ser Ile Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly 

2390 2395 2400 

Asp Pro Asp Leu Glu Pro Glu Gln Val Glu Leu Gln Pro Pro Pro 

2405 2410 2415 

Gln Gly Gly Val Val Thr Pro Gly Ser Gly Ser Gly Ser Trp Ser 

2420 2425 2430 

Thr Cys Ser Glu Glu Asp Asp Ser Val Val Cys Cys Ser Met Ser 

2435 2440 2445 

Tyr Ser Trp Thr Gly Ala Leu Ile Thr Pro Cys Ser Pro Glu Glu 

2450 2455 2460 

Glu Lys Leu Pro Ile Asn Pro Leu Ser Asn Ser Leu Leu Arg Tyr 

2465 2470 2475 

His Asn Lys Val Tyr Cys Thr Thr Ser Lys Ser Ala Ser Leu Arg 

2480 2485 2490 

Ala Lys Lys Val Thr Phe Asp Arg Met Gln Ala Leu Asp Ala His 

2495 2500 2505 

Tyr Asp Ser Val Leu Lys Asp Ile Lys Leu Ala Ala Ser Lys Val 

2510 2515 2520 

Thr Ala Arg Leu Leu Thr Leu Glu Glu Ala Cys Gln Leu Thr Pro 

2525 2530 2535 

Pro His Ser Ala Arg Ser Lys Tyr Gly Phe Gly Ala Lys Glu Val 

2540 2545 2550 
Arg Ser Leu Ser Gly Arg Ala Val Asn His Ile Lys Ser Val Trp 

2555 2560 2565 

Lys Asp Leu Leu Glu Asp Thr Gln Thr Pro Ile Pro Thr Thr Ile 

2570 2575 2580 

Met Ala Lys Asn Glu Val Phe Cys Val Asp Pro Thr Lys Gly Gly 

2585 2590 2595 

Lys Lys Ala Ala Arg Leu Ile Val Tyr Pro Asp Leu Gly Val Arg 

2600 2605 2610 

Val Cys Glu Lys Met Ala Leu Tyr Asp Ile Thr Gln Lys Leu Pro 

2615 2620 2625 

Gln Ala Val Met Gly Ala Ser Tyr Gly Phe Gln Tyr Ser Pro Ala 

2630 2635 2640 

Gln Arg Val Glu Phe Leu Leu Lys Ala Trp Ala Glu Lys Lys Asp 

2645 2650 2655 

Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val 

2660 2665 2670 

Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Arg Ala Cys 

2675 2680 2685 

Ser Leu Pro Glu Glu Ala His Thr Ala Ile His Ser Leu Thr Glu 

2690 2695 2700 

Arg Leu Tyr Val Gly Gly Pro Met Phe Asn Ser Lys Gly Gln Thr 

2705 2710 2715 

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser 

2720 2725 2730 

Met Gly Asn Thr Ile Thr Cys Tyr Val Lys Ala Leu Ala Ala Cys 

2735 2740 2745 

Lys Ala Ala Gly Ile Ile Ala Pro Thr Met Leu Val Cys Gly Asp 

2750 2755 2760 

Asp Leu Val Val Ile Ser Glu Ser Gln Gly Thr Glu Glu Asp Glu 

2765 2770 2775 

Arg Asn Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala 

2780 2785 2790 

Pro Pro Gly Asp Pro Pro Arg Pro Glu Tyr Asp Leu Glu Leu Ile 

2795 2800 2805 

Thr Ser Cys Ser Ser Asn Val Ser Val Ala Leu Gly Pro Gln Gly 

2810 2815 2820 

Arg Arg Arg Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Ile Ala 

2825 2830 2835 

Arg Ala Ala Trp Glu Thr Val Arg His Ser Pro Val Asn Ser Trp 

2840 2845 2850 

Leu Gly Asn Ile Ile Gln Tyr Ala Pro Thr Ile Trp Ala Arg Met 

2855 2860 2865 

Val Leu Met Thr His Phe Phe Ser Ile Leu Met Ala Gln Asp Thr 

2870 2875 2880 

Leu Asp Gln Asn Leu Asn Phe Glu Met Tyr Gly Ala Val Tyr Ser 

2885 2890 2895 

Val Ser Pro Leu Asp Leu Pro Ala Ile Ile Glu Arg Leu His Gly 

2900 2905 2910 

Leu Asp Ala Phe Ser Leu His Thr Tyr Thr Pro His Glu Leu Thr 

2915 2920 2925 

Arg Val Ala Ser Ala Leu Arg Lys Leu Gly Ala Pro Pro Leu Arg 

2930 2935 2940 

Ala Trp Lys Ser Arg Ala Arg Ala Val Arg Ala Ser Leu Ile Ser 

2945 2950 2955 

Arg Gly Gly Arg Ala Ala Val Cys Gly Arg Tyr Leu Phe Asn Trp 

2960 2965 2970 
Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Leu Pro Glu Ala Arg 

2975 2980 2985 

Leu Leu Asp Leu Ser Ser Trp Phe Thr Val Gly Ala Gly Gly Gly 

2990 2995 3000 

Asp Ile Tyr His Ser Val Ser Arg Ala Arg Pro Arg Leu Leu Leu 

3005 3010 3015 

Leu Gly Leu Leu Leu Leu Phe Val Gly Val Gly Leu Phe Leu Leu 

3020 3025 3030 

Pro Ala Arg 
3033 
Sequence ID No. 6 

Sequence Length: 9,511 

Sequence Type: nucleic acid 

Strandedness: single 

Topology: linear 

Molecule Type: genomic RNA 

Method for Determination of Feature: E 



GCCCGCCCCC UGAUGGGGGC GACACUCCGC CAUGAAUCAC UCCCCUGUGA GGAACUACUG 60 
UCUUCACGCA GAAAGCGUCU AGCCAUGGCG UUAGUAUGAG UGUCGUACAG CCUCCAGGCC 120 
CCCCCCUCCC GGGAGAGCCA UAGUGGUCUG CGGAACCGGU GAGUACACCG GAAUUACCGG 180 
AAAGACUGGG UCCUUUCUUG GAUAAACCCA CUCUAUGUCC GGUCAUUUGG GCACGCCCCC 240 
GCAAGACUGC UAGCCGAGUA GCGUUGGGUU GCGAAAGGCC UUGUGGUACU GCCUGAUAGG 300 
GURCUUGCGA GUGCCCCGGG AGGUCUCGUA GACCGUGCAU CAUGAGCACA AAUCCUAAAC 360 
CUCAAAGAAA AACCAAAAGA AACACAAACC GCCGCCCACA GGACGUUAAG UUCCCGGGUG 420 
GCGGUCAGAU CGUUGGCGGA GUUUACUUGC UGCCGCGCAG GGGCCCCAGG UUGGGUGUGC 480 
GCGCGACAAG GAAGACUUCY GAGCGAUCCC AGCCGCGUGG ACGACGCCAG CCCAUCCCGA 540 
AAGAUCGGCG CUCCACCGGC AAGUCCUGGG GAAAGCCAGG AUAUCCUUGG CCCCUGUACG 600 
GAAACGAGGG UUGCGGCUGG GCGGGUUGGC UCCUGUCCCC CCGCGGGUCU CGUCCUACUU 660 
GGGGCCCCAC CGACCCCCGG CAUAGAUCAC GCAAUUUGGG CAGAGUCAUC GAUACCAUUA 720 
CGUGUGGUUU UGCCGACCUC AUGGGGUACA UCCCUGUCGU UGGCGCCCCG GUYGGAGGCG 780 
UCGCCAGAGC UCUGGCACAC GGUGUUAGGG UCCUGGAGGA CGGGAUAAAU UACGCAACAG 840 
GGAAUUUACC CGGUUGCUCU UUUUCUAUCU UUUUGCUUGC UCUUCUGUCA UGCGUCACAR 900 
UGCCAGUGUC UGCAGUGGAA GUCAGGAACA UYAGUUCUAG CUACUACGCC ACUAAUGAUU 960 
GCUCAAACAA CAGCAUCACC UGGCAGCUCA CUGACGCAGU UCUCCAUCUU CCUGGAUGCG 1020 
UCCCAUGUGA GAAYGAUAAY GGCACCUUGC RUUGCUGGAU ACAAGUAACA CCCRACGUGG 1080 
CUGUGAAACA CCGCGGUGCG CUCACUCGUA GCCUGCGAAC ACACGUCGAC AUGAUCGUAA 1140 






UGGCAGCUAC GGCCUGCUCG GCCUUGUAUG UGGGAGAUGU GUGCGGGGCC GUGAUGAUYC 1200 
UAUCGCAGGC UUUCAUGGUA UCACCACAAC GCCACAACUU CACCCAAGAG UGCAACUGUU 1260 
CCAUCUACCA AGGUCACAUC ACCGGCCAUC GCAUGGCAUG GGACAUGAUG CURARCUGGU 1320 
CUCCAACUCU URCCAUGAUC CUCGCCUACG CYGCUCGYGU UCCCGARCUG GUCCUCGAAA 1380 
UYAUYUUCGG CGGCCAUUGG GGUGUGGYGU UYGGCUUGGS CUAUUUCUCC AUGCARGGAG 1440 
CGUGGGCCAA AGUCRUYGCC AUCCUCCUUC UUGUUGCGGG AGUGGAUGCA WCCACCUAUU 1500 
CCASCGGYCA GSAAGCGGGU CGURCCGYCK HKGGGWUCKC URGCCUCUUU AHUACUGGUG 1560 
CCAAGCAGAA CCUCYAUUUR AUCAACACCA AUGGCAGCUG GCACAUAAAC CGGACUGCCC 1620 
UCAAUUGCAA UGACAGCYUA SAGACGGGUU UCHUCGCUUC CYUGKUUUAC WHCCRCARGU 1680 
UCAACAGCUC UGGCUGCCCC GAGCGCUUGU CUUCCUGCCG CGGGCUGGAC GAVUUYCGCA 1740 
UCGGCUGGGG AACCUUGGAA UACGAAACCA ACGUCACCAA CGAUGRGGAC AUGAGGCCGU 1800 
ACUGCUGGCA UUACCCCCCG AGGCCUUGCG GCAUCGUCCC GGCUAGGACG GUUUGCGGAC 1860 
CGGUCUAUUG YUUCACCCCU AGCCCUGUUG UCGUGGGCAC CACUGACAAG CAGGGCGUAC 1920 
CCACCUACAC CUGGGGRGAA AACGAGACCG AUGUCUUCCU GCRAAAUAGC ACAAGACCCC 1980 
CGCGAGGAGC UUGGUUCGGC UGCACYUGGA UGAACGGGAC UGGGUUCACU AAGACAUGCG 2040 
GUGCACCACC UUGCCGCAUU AGGAAAGACU ACAACAGCAC UCUCGAUUUA UUGUGCCCCA 2100 
CAGACUGUUU UAGGAAGCAC CCAGAUGCUA CCUAUCUUAA GUGUGGAGCA GGGCCUUGGU 2160 
UAACUCCCAG GUGCCUGGUA GACUACCCUU AUAGRYUGUG GCAUUAUCCG UGCACUGUAA 2220 
ACUUCACCAU CUUYAAGGCG CGGAUGUAUG UAGGAGGGGU GGAGCAUCGA UUCUCCGCAG 2280 
CAUGCAACUU CACGCGCGGA GAUCGCUGCA GACUGGAAGA UAGGGAUAGG GGYCAGCAGA 2340 
GUCCACUGCU GCAUUCCACU ACUGAGUGGG CGGUGYUCCC AUGCUCCUUC UCUGACCUAC 2400 
CAGCACUAUC CACUGGCCUA UUGCACCUCC ACCAAAACAU CGUGGACGUG CAGUACCUYU 2460 
ACGGACUUUC UCCGGCUCUG ACAAGAUACA UCGUGAAGUG GGAGUGGGUG AUCCUCCUUU 2520 
UCUUGUUGUU GGCAGACGCC AGGRUCUGUG CAUGCCUUUG GAUGCUCAHC AUACUGGGCC 2580 
AAGCCGAAGC GGCGCUUGAG AAGCUCAUCA UCUUGCACUC CGCUAGYGCU GCUAGUGCCA 2640 
AUGGUCCGCU GUGGUUUUUC AUCUUCUUUA CAGCGGCCUG GUACUUAAAG GGCAGGGUGG 2700 
UCCCCGUGGC CACGUACUCU GUGCUCGGCU URUGGUCCUU CCUCCUCCUA GUCCUGGCYU 2760 
UACCACAGCA GGCUUAUGCC UUGGACGCUG CUGAACAAGG GGAACUGGGG CUGGCCAUAU 2820 
UAGUAAUUAU AUCCAUCUUU ACUCUUACCC CAGCAUACAA GAUCCUCCUG AGCCGUUCAG 2880 
UGUGGUGGCU GUCCUACAUG CUGGUCUUGG CCGAGGCCCA GAUUCAGCAA UGGGUUCCCC 2940 
CCCUGGAGGU CCGAGGGGGG CGUGACGGGA UCAUCUGGGU GGCUGUCAUU CUACACCCAC 3000 
GCCUUGUGUU UGAGGUCACG AAAUGGUUGU UAGCAAUCCU GGGGCCUGCC UACCUCCUUA 3060 
RAGCGUCUCU GCUACGGAUA CCGUACUUUG UGAGGGCCCA CGCUUUGCUA CGAGUGUGUA 3120 
CCCUGGUGAA ACACCUCGCR GGGGCUAGGU ACAUCCAGAU GCUGUURAUC ACCAUAGGCA 3180 
GAUGGACCGG CACUUACAUC UACGACCACC UCUCCCCUUU AUCAACUUGG GCGGCCCAGG 3240 
GUUURCGGGA CCUGGCAAUC GCCGUGGAGC CUGUGGUGUU CAGCCCAAUG GAGAAGAAGG 3300 
UCAUUGUGUG GGGGGCUGAG ACAGUGGCGU GUGGAGACAU CCUGCAUGGC CUCCCGGUCU 3360 
CCGCGAGGCU AGGUAGGGAR GUUCUGCUCG GCCCUGCCGA CGGCUACACC UCCAAGGGGU 3420 
GGAAKCUCCU AGCUCCCAUU ACUGCUUACA CUCAGCAAAC UCGUGGUCUC CUGGGUGCUA 3480 
UCGUGGUCAG CCUAACGGGC CGCGACAAAA AUGAGCAGGC UGGGCAGGUC CAGGUUCUGU 3540 
CCUCCGUCAC ACAAACUUUC UUGGGGACAU CCAUUUCGGG CGUCCUCUGG ACAGUAUAUC 3600 
ACGGGGCUGG UAAUAAGACC UUGGCCGGCC CCAAGGGACC AGUCACUCAG AUGUACACCA 3660 
GCGCAGAAGG GGACCUCGUG GGAUGGCCUA GUCCCCCCGG GACUAAGUCA UUGGACCCCU 3720 
GUACCUGCGG GGCCGUAGAC CUCUACCUGG UCACCCGAAA CGCUGAUGUC AUUCCGGUCC 3780 
GGAGGAAAGA UGACCGACGG GGUGCAUUAC UCUCGCCAAG GCCCCUCUCA ACCCUCAAAG 3840 
GAUCAUCCGG AGGGCCCGUG CUCUGCUCWA GGGGACACGC CGUGGGCUUG UUCAGAGCGG 3900 
CCGUGUGUGC CAGGGGUGUA GCCAAAUCUA UUGACUUCAU CCCCGUCGAA UCACUCGAUR 3960 
UCGCCACACG GACGCCCAGU UUCUCUGACA ACAGURCGCC GCCAGCUGUG CCCCAGUCUU 4020 
ACCAGGUGGG UUACUUGCAC GCACCAACAG GCAGCGGAAA GAGCACCAAG GUCCCUGCCG 4080 
CGUAUGCCAG UCAGGGGUAU AAAGUACUCG UACUAAAUCC CUCUGUCGCG GCCACACUUG 4140 
GUUUUGGGGC CUACAUGUCC AAAGCCCACG GGAUCAACCC UAAUAUCAGA ACUGGAGUGC 4200 
GGACCGUUAC CACCGGGGAC UCUAUCACUU ACUCCACUUA UGGCAAGUUU AUCGCAGAUG 4260 
GAGGCUGUGC AGCCGGUGCC UAUGACAUCA UCAUAUGCGA CGAAUGCCAU UCAGUGGACG 4320 
CUACUACCAU CCUUGGCAUU GGAACAGUCC UUGACCAAGC UGAGACCGCA GGCGUCAGGC 4380 
UAGUGGUYUU GGCCACAGCC ACGCCUCCCG GUACGGUGAC AACUCCCCAC AGUAACAUAG 4440 
AGGAGGUGGC CCUUGGUCAC GAGGGCGAGA UCCCUUUUUA UGGCAAAGCU AUUCCCCUAG 4500 
CUUUCAUCAA GGGGGGCAGA CACUUGAUCU UUUGCCAUUC AAAGAAGAAG UGCGACGAGC 4560 
UCGCAGCGGC CCUCCGGGGC AYGGGUGUCA AUGCCGUUGC AUACUAUAGG GGUCUCGACG 4620 
UCUCCGUUAU ACCAACUCAA GGAGACGUGG UGGUUGUCGC CACUGAUGCC CUAAUGACUG 4680 
GGUACACCGG CGACUUUGAC UCYGUCAUCG ACUGUAAUGU UGCAGUCUCU CAGAUUGUUG 4740 
ACUUCAGCCU AGACCCAACC UUCACCAUGA CCACUCAAAC CGUCCCUCAG GACGCUGUCU 4800 
CCCGUAGUCA ACGUAGAGGG AGAACUGGGA GGGGGCGAUU GGGCRUUUAC AGGUAUGUUU 4860 
CGUCAGGYGA RRGGCCGUCU GGGAUGUUCG ACAGCGUAGU GCYCUGCGAG UGCUAUGAUG 4920 
CCGGGGCAGC CUGGUACGAG CUUACACCUG CUGAGACUAC GGUGAGACUC CGGGCYUAUU 4980 
UCAACACGCC CGGUUUGCCC GUAUGUCAAG ACCACCUGGA GUUCUGGGAA GCGGUCUUUA 5040 
CAGGUCUCAC WCACAUURAC GCCCACUUCC UCUCCCAGAC GAAGCAAGGA GGAGAAAACU 5100 
UUGCRUAUCU AACGGCCUAC CAGGCCACAG UAUGCGCCAG GGCAAAGGCC CCUCCUCCUU 5160 
CGUGGGACGU GAUGUGGAAG UGUCUAACUA GGCUCAAACC UACACUGACU GGUCCCACCC 5220 
CCCUCCUGUA CCGCUUGGGU GCCGUGACCA AUGAGGUYAC CUUGACGCAC CCCGUGACGA 5280 
AAUACAUCGC CACGUGCAUG CAAGCUGACC UYGAGAUCAU GACAAGCUCA UGGGUCCUGG 5340 
CGGGGGGGGU GCUAGCCGCC GUGGCAGCUU ACUGCCUGGC GACUGGCUGC AUUUCCAUCA 5400 
UUGGCCGCCU ACACCUGAAU GAUCGGGUGG UUGUGRCCCC YGACAAGGAR AUCUUAUAUG 5460 
AGGCCUUUGA UGAGAUGGAA GAAUGCGCCU CCAAAGCCGC CCUCAUUGAG GAAGGGCAGC 5520 
GGAUGGCGGA GAUGCUCAAA UCUAAGAUAC AAGGCCUCCU ACAACAGGCC ACAAGGCAAG 5580 
CUCAAGRCAU RCAGCCAGCU AUACAGUCAU CAUGGCCCAA GCUUGAACAA UUUUGGGCCA 5640 
AACACAUGUG GAACUUCAUC AGUGGUAUAC AGUACCUAGC AGGACUCUCC ACCCUACCGG 5700 
GAAAUCCUGC AGURGCAUCA AUGAUGGCUU UUAGCGCCGC GCUGACUAGC CCACUACCCA 5760 
CCAGCACCAC CAUCCUCUUG AACAUCAUGG GAGGAUGCUU GGCCUCYCAG AUUGCCCCCC 5820 
CUGCCGGAGC CACYGGCUUC GUUGUCAGUG GUCUAGUGGG GGCGGCCGUC GGAAGCAUAG 5880 
GCCUGGGUAA GAUACUGGUG GACGUUUUGG CCGGGUACGG CGCAGGCAUU UCAGGGGCCC 5940 
UCGUAGCUUU UAAGAUCAUG AGCGGCGAGA AGCCCACGGU AGAAGACGUU GUGAAUCUCC 6000 
UGCCUGCUAU YCUGUCUCCU GGUGCGYUGG UAGUGGGAGU CAUCUGUGCA GCAAUYCUGC 6060 
GCCGCCACGU CGGUCAGGGA GAGGGRGCGG UCCAGUGGAU GAACAGACUG AUCGCCUUCG 6120 
CCUCCAGGGG AAACCACGUU GCCCCUACCC ACUACGUGGU GGAGUCUGAC GCUUCACAGC 6180 
GUGURACGCA GGUGCUGAGU UCACUUACAA UUACCAGCUU ACUUAGGAGA CUACAUGCCU 6240 
GGAUCACUGA AGAUUGCCCA RUCCCAUGCU CGGGGUCUUG GCUCCAGGAC AUUUGGGAUU 6300 
GGGUUUGUUC CAUCCUCACA GACUUYAAAA ACUGGCUGUC UUCAAAAUUA CUCCCCAAGA 6360 
UGCCCGGCAU UCCCUUUAUC UCUUGCCAGA AGGGAUACAA GGGUGUAUGG GCUGGUACGG 6420 
GUGUCAUGAC YACUCGRURC CCAUGUGGAG CAAACAUCUC GGGCCAUGUC CGCAUGGGCA 6480 
CCAUGAAAAU AACAGGCCCG AAGACUUGCY UGAACCUGUG GCAGGGGACU UUCCCCAUUA 6540 
AUUGUUACAC AGAAGGGCCY UGCGUGCCAA AACCCCCUCC UAAUUACAAG ACCGCAAUUU 6600 
GGAGGGUGGC AGCGUCGGAG UACGUUGAGG UCACACAGCA UGGCUCUUUC UCGUAUGUAA 6660 
CRGGGUUAAC CAGUGACAAC CUUAAGGUYC CUUGCCAGGU ACCAGCUCCA GAAUUUUUCU 6720 
CUUGGGUGGA CGGGGUGCAA AUCCACCGAU UCGCCCCGGU WCCAGGUCCC UUCUUUCGGG 6780 
AUGAGGUAAC GUUCACCGUA GGCCUUAACU CCUUCGUGGU CGGCUCUCAG CUCCCUUGCG 6840 
AUCCUGAGCC GGACACCGAR GUACUGGCCU CYAUGUUGAC AGACCCGUCC CACAUCACCG 6900 
CKGAGGCGGC AGCCAGGCGA UUGGCAAGGG GAUCUCCCCC YUCACAGGCU AGCUCCUCAG 6960 
CGAGCCAGCU CUCUGCCCCG UCCUUGAAGG CUACCUGUAC CACCCAUAAG ACAGCAUAUG 7020 
AUUGUGACAU GGUGGAUGCY AACCUUUUCA UGGGAGGHGA UGUGAYCCGG AUUGAGUCUG 7080 
ACUCUAAGGU GAUCGUUCUA GACUCCCUCG AUUCCAUGAC UGAGGUAGAG GAUGAUCGUG 7140 
AGCCUUCUGU ACCAUCAGAG UACCUGAUCA AGAGGAGAAA GUUCCCACCG GCGCUGCCUC 7200 
CUUGGGCCCG UCCAGACUAC AAUCCUGUUU UGAUCGAGAC AUGGAAGAGG CCGGGCUAUG 7260 
AACCACCCAC UGUCCUAGGC UGUGCCCUCC CCCCCACACY UCAAACGCCA GUGCCUCCAC 7320 
CUCGGAGGCG CCGCGCYAAA RUCCUGACCC AGGACRAUGU GGAGGGGRUC CUCAGGGAGA 7380 
UGGCUGACAA AGURCUCAGC CCUCUCCAAG ACAACAAUGA CUCCGGUCAC UCCACUGGAG 7440 
CGGAUACCGG AGGAGACAUC GUCCAGCAAC CCUCUGACGA GACUGCCGCU UCAGAAGCGG 7500 
GGUCACUGUC CUCCAUGCCU CCCCUUGAGG GAGAGCCGGG AGACCCYGAC CUGGAGUUUG 7560 
AACCAGUGGG AUCCGCUCCC CCUUCUGAGG GGGAGUGUGA GGUCAUUGAU UCGGACUCUA 7620 


AGUCGUGGUC CACAGUCUCU GAUCAAGAGG 

U r\ U vo r\U nU U 


AlilirilGlltlAII 


nifiriifiriirii 

w U U V/ U Vl\/ UV/ u 




7680 


CCUGGACGGG GGCCCUCAUA ACACCAUGUG GGCCCGAAGA GGAGAAGUUA CCGAUCAACC 7740 
CUCUGAGUAA UUCGCUCAUG CGGUUCCAUA AYAAGGUGUA CUCCACAACC UCGAGGAGUG 7800 
CCUCUCUGAG GGCAAAGAAG GUGACUUUUG ACAGGGUGCA GGUGCUGGAC GCACACUAUG 7860 
ACUCAGUCUU GCAGGACGUU AAGCGGGCCG CCUCUAAGGU URGUGCGAGG CUCCUCACAG 7920 
UAGAGGAAGC CUGCGCGCUG ACCCCGCCCC ACUCCGCCAA AUCGCGAUAC GGAUUUGGGG 7980 
CAAAAGAGGU GCGCAGCUUA UCCAGGAGGG CCGUUAACCA CAUCCGGUCC GUGUGGGAGG 8040 
ACCUCCUGGA AGACCAACRU ACCCCAAUUG ACACAACUAU CAUGGCUAAA AAUGAGGUGU 8100 
UCUGCAUUGA UCCAACUAAR GGUGGGAAAA AGCCAGCUCG CCUCAUCGUA UACCCCGACC 8160 
UUGGGGUCAG GGUGUGCGAA AAGAUGGCCC UCUAUGACAU CRCACAAAAG CUUCCCAAAG 8220 
CGAUAAUGGG GCCAUCCUAU GGGUUCCAAU ACUCUCCCGC AGAACGGGUC GAUUUCCUCC 8280 
UCAAAGCUUG GGGAAGUAAG AAGGACCCAA UGGGGUUCUC GUAUGACACC CGCUGCUUUG 8340 
ACUCAACCGU CACGGAGAGG GACAUAAGAA CAGAAGAAUC CAUAUAUCAG GCUUGUUCUC 8400 
UGCCUCAAGA AGCCAGAACU GUCAUACACU CGCUCACUGA GAGACUUUAC GUAGGAGGGC 8460 
CCAUGACAAA CAGCAAAGGG CAAUCCUGCG GCUACAGGCG UUGCCGCGCA AGCGGKGUUU 8520 
UCACCACCAG CAUGGGGAAU ACCAUGACAU GUUACAUCAA AGCCCUUGCA GCGUGUAAGG 8580 
CUGCRGGGAU CGUGGACCCU GUUAUGUUGG UGUGUGGAGA CGACCUGGUC GUCAUCUCAG 8640 
AGAGCCAAGG UAACGAGGAG GACGAGCGAA ACCUGAGAGC UUUCACGGAG GCUAUGACCA 8700 
GGUAUUCCGC CCCUCCCGGU GACCUUCCCA GACCGGAAUA UGACUUGGAG CUUAUAACAU 8760 
CCUGCUCCUC AAACGUAUCG GUAGCGCUGG ACUCUCGGGG UCGCCGCCGG UACUUCCUAA 8820 
CCAGAGACCC UACCACUCCA AUCACCCGAG CUGCUUGGGA AACAGUAAGA CACUCCCCUG 8880 
UCAAUUCUUG GCUGGGCAAC AUCAUCCAGU ACGCCCCCAC AAUCUGGGUC CGGAUGGUCA 8940 
UAAUGACUCA CUUCUUCUCC AUACUAUUGG CCCAGGACAC UCUGAACCAA AAUCUCAAUU 9000 


UUGAGAUGUA CGGGGCAGUA UACUCGGUCA AUCCAUUAGA CCUACCGGCC AUAAUUGAAA 9060 
GGCUACAUGG GCUUGAAGCC UUUUCACUGC ACACAUACUC UCCCCACGAA CUCUCACGGG 9120 
UGGCAGCAAC UCUCAGAAAA CUUGGAGCGC CUCCCCUUAG AGCGUGGAAG AGUCGGGCGC 9180 
GUGCCGUGAG AGCUUCACUC AUCGCCCAAG GAGCGAGGGC GGCCAUUUGU GGCCGCUACC 9240 
UCUUCAACUG GGCGGUGAAA ACAAAGCUCA AACUCACUCC AUUGCCCGAG GCGAGCCGCC 9300 
UGGAUUUAUC CGGGUGGUUC ACCGUGGGCG CCGGCGGGGG CGACAUUUAU CACAGCGUGU 9360 
CGCAUGCYCG ACCCCGCCUA UUACUCCUUU GCCUACUCCU ACUUAGCGUA GGAGUAGGCA 9420 
UCUUUUUACU CCCCGCUCGG UAGAGCGGCA AACYCUAGCU ACACUCCAUA GCUAGUUUCC 9480 
GUUUUUUUUU UUUUUUUUUU UUUUUUUUUU U 9511 
Sequence ID No. 7 
Sequence Length: 9,511 
Sequence Type: nucleic acid 
Strandedness: single 
Topology: linear 

Molecule Type: cDNA to genomic RNA 
Method for Determination of Feature: E 



GCCCGCCCCC TGATGGGGGC GACACTCCGC CATGAATCAC TCCCCTGTGA GGAACTACTG 60 
TCTTCACGCA GAAAGCGTCT AGCCATGGCG TTAGTATGAG TGTCGTACAG CCTCCAGGCC 120 
CCCCCCTCCC GGGAGAGCCA TAGTGGTCTG CGGAACCGGT GAGTACACCG GAATTACCGG 180 


AAAGACTGGG TCCTTTCTTG GATAAACCCA CTCTATGTCC GGTCATTTGG GCACGCCCCC 240 






GCAAGACTGC TAGCCGAGTA GCGTTGGGTT GCGAAAGGCC TTGTGGTACT GCCTGATAGG 300 
GTRCTTGCGA GTGCCCCGGG AGGTCTCGTA GACCGTGCAT CATGAGCACA AATCCTAAAC 360 
CTCAAAGAAA AACCAAAAGA AACACAAACC GCCGCCCACA GGACGTTAAG TTCCCGGGTG 420 
GCGGTCAGAT CGTTGGCGGA GTTTACTTGC TGCCGCGCAG GGGCCCCAGG TTGGGTGTGC 480 
GCGCGACAAG GAAGACTTCY GAGCGATCCC AGCCGCGTGG ACGACGCCAG CCCATCCCGA 540 
AAGATCGGCG CTCCACCGGC AAGTCCTGGG GAAAGCCAGG ATATCCTTGG CCCCTGTACG 600 
GAAACGAGGG TTGCGGCTGG GCGGGTTGGC TCCTGTCCCC CCGCGGGTCT CGTCCTACTT 660 
GGGGCCCCAC CGACCCCCGG CATAGATCAC GCAATTTGGG CAGAGTCATC GATACCATTA 720 
CGTGTGGTTT TGCCGACCTC ATGGGGTACA TCCCTGTCGT TGGCGCCCCG GTYGGAGGCG 780 
TCGCCAGAGC TCTGGCACAC GGTGTTAGGG TCCTGGAGGA CGGGATAAAT TACGCAACAG 840 
GGAATTTACC CGGTTGCTCT TTTTCTATCT TTTTGCTTGC TCTTCTGTCA TGCGTCACAR 900 
TGCCAGTGTC TGCAGTGGAA GTCAGGAACA TYAGTTCTAG CTACTACGCC ACTAATGATT 960 
GCTCAAACAA CAGCATCACC TGGCAGCTCA CTGACGCAGT TCTCCATCTT CCTGGATGCG 1020 
TCCCATGTGA GAAYGATAAY GGCACCTTGC RTTGCTGGAT ACAAGTAACA CCCRACGTGG 1080 
CTGTGAAACA CCGCGGTGCG CTCACTCGTA GCCTGCGAAC ACACGTCGAC ATGATCGTAA 1140 
TGGCAGCTAC GGCCTGCTCG GCCTTGTATG TGGGAGATGT GTGCGGGGCC GTGATGATYC 1200 
TATCGCAGGC TTTCATGGTA TCACCACAAC GCCACAACTT CACCCAAGAG TGCAACTGTT 1260 
CCATCTACCA AGGTCACATC ACCGGCCATC GCATGGCATG GGACATGATG CTRARCTGGT 1320 
CTCCAACTCT TRCCATGATC CTCGCCTACG CYGCTCGYGT TCCCGARCTG GTCCTCGAAA 1380 
TYATYTTCGG CGGCCATTGG GGTGTGGYGT TYGGCTTGGS CTATTTCTCC ATGCARGGAG 1440 
CGTGGGCCAA AGTCRTYGCC ATCCTCCTTC TTGTTGCGGG AGTGGATGCA WCCACCTATT 1500 
CCASCGGYCA GSAAGCGGGT CGTRCCGYCK HKGGGWTCKC TRGCCTCTTT AHTACTGGTG 1560 
CCAAGCAGAA CCTCYATTTR ATCAACACCA ATGGCAGCTG GCACATAAAC CGGACTGCCC 1620 
TCAATTGCAA TGACAGCYTA SAGACGGGTT TCHTCGCTTC CYTGKTTTAC WHCCRCARGT 1680 
TCAACAGCTC TGGCTGCCCC GAGGGCTTGT CTTCCTGCCG CGGGCTGGAC GAYTTYCGCA 1740 
TCGGCTGGGG AACCTTGGAA TACGAAACCA ACGTCACCAA CGATGRGGAC ATGAGGCCGT 1800 
ACTGCTGGCA TTACCCCCCG AGGCCTTGCG GCATCGTCCC GGCTAGGACG GTTTGCGGAC 1860 
CGGTCTATTG YTTCACCCCT AGCCCTGTTG TCGTGGGCAC CACTGACAAG CAGGGCGTAC 1920 
CCACCTACAC CTGGGGRGAA AACGAGACCG ATGTCTTCCT GCTRAATAGC ACAAGACCCC 1980 
CGCGAGGAGC TTGGTTCGGC TGCACYTGGA TGAACGGGAC TGGGTTCACT AAGACATGCG 2040 
GTGCACCACC TTGCCGCATT AGGAAAGACT ACAACAGCAC TCTCGATTTA TTGTGCCCCA 2100 
CAGACTGTTT TAGGAAGCAC CCAGATGCTA CCTATCTTAA GTGTGGAGCA GGGCCTTGGT 2160 
TAACTCCCAG GTGCCTGGTA GACTACCCTT ATAGRYTGTG GCATTATCCG TGCACTGTAA 2220 
ACTTCACCAT CTTYAAGGCG CGGATGTATG TAGGAGGGGT GGAGCATCGA TTCTCCGCAG 2280 
CATGCAACTT CACGCGCGGA GATCGCTGCA GACTGGAAGA TAGGGATAGG GGYCAGCAGA 2340 
GTCCACTGCT GCATTCCACT ACTGAGTGGG CGGTGYTCCC ATGCTCCTTC TCTGACCTAC 2400 
CAGCACTATC CACTGGCCTA TTGCACCTCC ACCAAAACAT CGTGGACGTG CAGTACCTYT 2460 


ACGGACTTTC TCCGGCTCTG ACAAGATACA TCGTGAAGTG GGAGTGGGTG ATCCTCCTTT 2520 




TCTTGTTGTT GGCAGACGCC AGGRTCTGTG CATGCCTTTG GATGCTCAWC ATACTGGGCC 2580 
AAGCCGAAGC GGCGCTTGAG AAGCTCATCA TCTTGCACTC CGCTAGYGCT GCTAGTGCCA 2640 
ATGGTCCGCT GTGGTTTTTC ATCTTCTTTA CAGCGGCCTG GTACTTAAAG GGCAGGGTGG 2700 
TCCCCGTGGC CACGTACTCT GTGCTCGGCT TRTGGTCCTT CCTCGTCCTA GTCCTGGCYT 2760 
TACCACAGCA GGCTTATGCC TTGGACGCTG CTGAACAAGG GGAACTGGGG CTGGCCATAT 2820 
TAGTAATTAT ATCCATCTTT ACTCTTACCC CAGCATACAA GATCCTCCTG AGCCGTTCAG 2880 
TGTGGTGGCT GTCCTACATG CTGGTCTTGG CCGAGGCCCA GATTCAGCAA TGGGTTCCCC 2940 
CCCTGGAGGT CCGAGGGGGG CGTGACGGGA TCATCTGGGT GGCTGTCATT CTACACCCAC 3000 
GCCTTGTGTT TGAGGTCACG AAATGGTTGT TAGCAATCCT GGGGCCTGCC TACCTCCTTA 3060 
RAGCGTCTCT GCTACGGATA CCGTACTTTG TGAGGGCCCA CGCTTTGCTA CGAGTGTGTA 3120 
CCCTGGTGAA ACACCTCGCR GGGGCTAGGT ACATCCAGAT GCTGTTRATC ACCATAGGCA 3180 
GATGGACCGG CACTTACATC TACGACCACC TCTCCCCTTT ATCAACTTGG GCGGCCCAGG 3240 
GTTTRCGGGA CCTGGCAATC GCCGTGGAGC CTGTGGTGTT CAGCCCAATG GAGAAGAAGG 3300 
TCATTGTGTG GGGGGCTGAG ACAGTGGCGT GTGGAGACAT CCTGCATGGC CTCCCGGTCT 3360 
CCGCGAGGCT AGGTAGGGAR GTTCTGCTCG GCCCTGCCGA CGGCTACACC TCCAAGGGGT 3420 
GGAAKCTCCT AGCTCCCATT ACTGCTTACA CTCAGCAAAC TCGTGGTCTC CTGGGTGCTA 3480 
TCGTGGTCAG CCTAACGGGC CGCGACAAAA ATGAGCAGGC TGGGCAGGTC CAGGTTCTGT 3540 
CCTCCGTCAC ACAAACTTTC TTGGGGACAT CCATTTCGGG CGTCCTCTGG ACAGTATATC 3600 
ACGGGGCTGG TAATAAGACC TTGGCCGGCC CCAAGGGACC AGTCACTCAG ATGTACACCA 3660 
GCGCAGAAGG GGACCTCGTG GGATGGCCTA GTCCCCCCGG GACTAAGTCA TTGGACCCCT 3720 
GTACCTGCGG GGCCGTAGAC CTCTACCTGG TCACCCGAAA CGCTGATGTC ATTCCGGTCC 3780 
GGAGGAAAGA TGACCGACGG GGTGCATTAC TCTCGCCAAG GCCCCTCTCA ACCCTCAAAG 3840 
GATCATCCGG AGGGCCCGTG CTCTGCTCWA GGGGACACGC CGTGGGCTTG TTCAGAGCGG 3900 
CCGTGTGTGC CAGGGGTGTA GCCAAATCTA TTGACTTCAT CCCCGTCGAA TCACTCGATR 3960 
TCGCCACACG GACGCCCAGT TTCTCTGACA ACAGTRCGCC GCCAGCTGTG CCCCAGTCTT 4020 
ACCAGGTGGG TTACTTGCAC GCACCAACAG GCAGCGGAAA GAGCACCAAG GTCCCTGCCG 4080 
CGTATGCCAG TCAGGGGTAT AAAGTACTCG TACTAAATCC CTCTGTCGCG GCCACACTTG 4140 
GTTTTGGGGC CTACATGTCC AAAGCCCACG GGATCAACCC TAATATCAGA ACTGGAGTGC 4200 






GGACCGTTAC CACCGGGGAC TCTATCACTT ACTCCACTTA TGGCAAGTTT ATCGCAGATG 4260 




GAGGCTGTGC AGCCGGTGCC TATGACATCA TCATATGCGA CGAATGCCAT TCAGTGGACG 4320 
CTACTACCAT CCTTGGCATT GGAACAGTCC TTGACCAAGC TGAGACCGCA GGCGTCAGGC 4380 
TAGTGGTYTT GGCCACAGCC ACGCCTCCCG GTACGGTGAC AACTCCCCAC AGTAACATAG 4440 
AGGAGGTGGC CCTTGGTCAC GAGGGCGAGA TCCCTTTTTA TGGCAAAGCT ATTCCCCTAG 4500 
CTTTCATCAA GGGGGGCAGA CACTTGATCT TTTGCCATTC AAAGAAGAAG TGCGACGAGC 4560 
TCGCAGCGGC CCTCCGGGGC AYGGGTGTCA ATGCCGTTGC ATACTATAGG GGTCTCGACG 4620 
TCTCCGTTAT ACCAACTCAA GGAGACGTGG TGGTTGTCGC CACTGATGCC CTAATGACTG 4680 
GGTACACCGG CGACTTTGAC TCYGTCATCG ACTGTAATGT TGCAGTCTCT CAGATTGTTG 4740 
ACTTCAGCCT AGACCCAACC TTCACCATCA CCACTCAAAC CGTCCCTCAG GACGCTGTCT 4800 
CCCGTAGTCA ACGTAGAGGG AGAACTGGGA GGGGGCGATT GGGCRTTTAC AGGTATGTTT 4860 
CGTCAGGYGA RRGGCCGTCT GGGATGTTCG ACAGCGTAGT GCYCTGCGAG TGCTATGATG 4920 
CCGGGGCAGC CTGGTACGAG CTTACACCTG CTGAGACTAC GGTGAGACTC CGGGCYTATT 4980 
TCAACACGCC CGGTTTGCCC GTATGTCAAG ACCACCTGGA GTTCTGGGAA GCGGTCTTTA 5040 
CAGGTCTCAC WCACATTRAC GCCCACTTCC TCTCCCAGAC GAAGCAAGGA GGAGAAAACT 5100 
TTGCRTATCT AACGGCCTAC CAGGCCACAG TATGCGCCAG GGCAAAGGCC CCTCCTCGTT 5160 
CGTGGGACGT GATGTGGAAG TGTCTAACTA GGCTCAAACC TACACTGACT GGTCCCACCC 5220 
CCCTCCTGTA CCGCTTGGGT GCCGTGACCA ATGAGGTYAC CTTGACGCAC CCCGTGACGA 5280 
AATACATCGC CACGTGCATG CAAGCTGACC TYGAGATCAT GACAAGCTCA TGGGTCCTGG 5340 
CGGGGGGGGT GCTAGCCGCC GTGGCAGCTT ACTGCCTGGC GACTGGCTGC ATTTCCATCA 5400 
TTGGCCGCCT ACACCTGAAT GATCGGGTGG TTGTGRCCCC YGACAAGGAR ATCTTATATG 5460 
AGGCCTTTGA TGAGATGGAA GAATGCGCCT CCAAAGCCGC CCTCATTGAG GAAGGGCAGC 5520 
GGATGGCGGA GATGCTCAAA TCTAAGATAC AAGGCCTCCT ACAACAGGCC ACAAGGCAAG 5580 
CTCAAGRCAT RCAGCCAGCT ATACAGTCAT CATGGCCCAA GCTTGAACAA TTTTGGGCCA 5640 
AACACATGTG GAACTTCATC AGTGGTATAC AGTACCTAGC AGGACTCTCC ACCCTACCGG 5700 
GAAATCCTGC AGTRGCATCA ATGATGGCTT TTAGCGCCGC GCTGACTAGC CCACTACCCA 5760 
CCAGCACCAC CATCCTCTTG AACATCATGG GAGGATGCTT GGCCTCYCAG ATTGCCCCCC 5820 
CTGCCGGAGC CACYGGCTTC GTTGTCAGTG GTCTAGTGGG GGCGGCCGTC GGAAGCATAG 5880 
GCCTGGGTAA GATACTGGTG GACGTTTTGG CCGGGTACGG CGCAGGCATT TCAGGGGCCC 5940 


TCGTAGCTTT TAAGATCATG AGCGGCGAGA AGCCCACGGT AGAAGACGTT GTGAATCTCC 6000 


TGCCTGCTAT YCTGTCTCCT GGTGCGYTGG TAGTGGGAGT CATCTGTGCA GCAATYCTGC 6060 
GCCGCCACGT CGGTCAGGGA GAGGGRGCGG TCCAGTGGAT GAACAGACTG ATCGCCTTCG 6120 
CCTCCAGGGG AAACCACGTT GCCCCTACCC ACTACGTGGT GGAGTCTGAC GCTTCACAGC 6180 
GTGTRACGCA GGTGCTGAGT TCACTTACAA TTACCAGCTT ACTTAGGAGA CTACATGCCT 6240 
GGATCACTGA AGATTGCCCA RTCCCATGCT CGGGGTCTTG GCTCCAGGAC ATTTGGGATT 6300 
GGGTTTGTTC CATCCTCACA GACTTYAAAA ACTGGCTGTC TTCAAAATTA CTCCCCAAGA 6360 
TGCCCGGCAT TCCCTTTATC TCTTGCCAGA AGGGATACAA GGGTGTATGG GCTGGTACGG 6420 
GTGTCATGAC YACTCGRTRC CCATGTGGAG CAAACATCTC GGGCCATGTC CGCATGGGCA 6480 
CCATGAAAAT AACAGGCCCG AAGACTTGCY TGAACCTGTG GCAGGGGACT TTCCCCATTA 6540 
ATTGTTACAC AGAAGGGCCY TGCGTGCCAA AACCCCCTCC TAATTACAAG ACCGCAATTT 6600 
GGAGGGTGGC AGCGTCGGAG TACGTTGAGG TCACACAGCA TGGCTCTTTC TCGTATGTAA 6660 
CRGGGTTAAC CAGTGACAAC CTTAAGGTYC CTTGCCAGGT ACCAGCTCCA GAATTTTTCT 6720 
CTTGGGTGGA CGGGGTGCAA ATCCACCGAT TCGCCCCGGT WCCAGGTCCC TTCTTTCGGG 6780 
ATGAGGTAAC GTTCACCGTA GGCCTTAACT CCTTCGTGGT CGGCTCTCAG CTCCCTTGCG 6840 
ATCCTGAGCC GGACACCGAR GTACTGGCCT CYATGTTGAC AGACCCGTCC CACATCACCG 6900 
CKGAGGCGGC AGCCAGGCGA TTGGCAAGGG GATCTCCCCC YTCACAGGCT AGCTCCTCAG 6960 
CGAGCCAGCT CTCTGCCCCG TCCTTGAAGG CTACCTGTAC CACCCATAAG ACAGCATATG 7020 
ATTGTGACAT GGTGGATGCY AACCTTTTCA TGGGAGGHGA TGTGAYCCGG ATTGAGTCTG 7080 
ACTCTAAGGT GATCGTTCTA GACTCCCTCG ATTCCATGAC TGAGGTAGAG GATGATCGTG 7140 
AGCCTTCTGT ACCATCAGAG TACCTGATCA AGAGGAGAAA GTTCCCACCG GCGCTGCCTC 7200 
CTTGGGCCCG TCCAGACTAC AATCCTGTTT TGATCGAGAC ATGGAAGAGG CCGGGCTATG 7260 
AACCACCCAC TGTCCTAGGC TGTGCCCTCC CCCCCACACY TCAAACGCCA GTGCCTCCAC 7320 
CTCGGAGGCG CCGCGCYAAA RTCCTGACCC AGGACRATGT GGAGGGGRTC CTCAGGGAGA 7380 
TGGCTGACAA AGTRCTCAGC CCTCTCCAAG ACAACAATGA CTCCGGTCAC TCCACTGGAG 7440 
CGGATACCGG AGGAGACATC GTCCAGCAAC CCTCTGACGA GACTGCCGCT TCAGAAGCGG 7500 
GGTCACTGTC CTCCATGCCT CCCCTTGAGG GAGAGCCGGG AGACCCYGAC CTGGAGTTTG 7560 
AACCAGTGGG ATCCGCTCCC CCTTCTGAGG GGGAGTGTGA GGTCATTGAT TCGGACTCTA 7620 


AGTCGTGGTr 

nU 1 VU 1 UU 1 Vf 


rAOAGTrirT 


GATf^AAftAfi/i 

UA 1 uAAUAuU 


At 1 V/ 1 u 1 1 Al 


U 1 UU 1 UU 1 U 1 


AT^ITPATAPT 
A 1 Ul UA 1 AUI 




UU 1 uuALbub 


(lUl/CC 1 CA 1 A 


A A A A T AT/^ 

ACACCATGTG


GGCCCGAAGA 


GGAGAAGTTA 


/\/\ /\ A T/\ « A /\/\ 

CCGATCAACC 


7740 


CTCTGAGTAA TTCGCTCATG CGGTTCCATA AYAAGGTGTA CTCCACAACC TCGAGGAGTG 7800

CCTCTCTGAG GGCAAAGAAG GTGACTTTTG ACAGGGTGCA GGTGCTGGAC GCACACTATG 7860

ACTCAGTCTT GCAGGACGTT AAGCGGGCCG CCTCTAAGGT TRGTGCGAGG CTCCTCACAG 7920

TAGAGGAAGC CTGCGCGCTG ACCCCGCCCC ACTCCGCCAA ATCGCGATAC GGATTTGGGG 7980

CAAAAGAGGT GCGCAGCTTA TCCAGGAGGG CCGTTAACCA CATCCGGTCC GTGTGGGAGG 8040

ACCTCCTGGA AGACCAACRT ACCCCAATTG ACACAACTAT CATGGCTAAA AATGAGGTGT 8100

TCTGCATTGA TCCAACTAAR GGTGGGAAAA AGCCAGCTCG CCTCATCGTA TACCCCGACC 8160
TTGGGGTCAG GGTGTGCGAA AAGATGGCCC TCTATGACAT CRCACAAAAG CTTCCCAAAG 8220

CGATAATGGG GCCATCCTAT GGGTTCCAAT ACTCTCCCGC AGAACGGGTC GATTTCCTCC 8280

TCAAAGCTTG GGGAAGTAAG AAGGACCCAA TGGGGTTCTC GTATGACACC CGCTGCTTTG 8340

ACTCAACCGT CACGGAGAGG GACATAAGAA CAGAAGAATC CATATATCAG GCTTGTTCTC 8400

TGCCTCAAGA AGCCAGAACT GTCATACACT CGCTCACTGA GAGACTTTAC GTAGGAGGGC 8460

CCATGACAAA CAGCAAAGGG CAATCCTGCG GCTACAGGCG TTGCCGCGCA AGCGGKGTTT 8520


TCACCACCAG CATGGGGAAT ACCATGACAT GTTACATCAA AGCCCTTGCA GCGTGTAAGG 8580

CTGCRGGGAT CGTGGACCCT GTTATGTTGG TGTGTGGAGA CGACCTGGTC GTCATCTCAG 8640

AGAGCCAAGG TAACGAGGAG GACGAGCGAA ACCTGAGAGC TTTCACGGAG GCTATGACCA 8700

GGTATTCCGC CCCTCCCGGT GACCTTCCCA GACCGGAATA TGACTTGGAG CTTATAACAT 8760

CCTGCTCCTC AAACGTATCG GTAGCGCTGG ACTCTCGGGG TCGCCGCCGG TACTTCCTAA 8820

CCAGAGACCC TACCACTCCA ATCACCCGAG CTGCTTGGGA AACAGTAAGA CACTCCCCTG 8880


TCAATTCTTG GCTGGGCAAC ATCATCCAGT ACGCCCCCAC AATCTGGGTC CGGATGGTCA 8940

TAATGACTCA CTTCTTCTCC ATACTATTGG CCCAGGACAC TCTGAACCAA AATCTCAATT 9000

TTGAGATGTA CGGGGCAGTA TACTCGGTCA ATCCATTAGA CCTACCGGCC ATAATTGAAA 9060

GGCTACATGG GCTTGAAGCC TTTTCACTGC ACACATACTC TCCCCACGAA CTCTCACGGG 9120

TGGCAGCAAC TCTCAGAAAA CTTGGAGCGC CTCCCCTTAG AGCGTGGAAG AGTCGGGCGC 9180

GTGCCGTGAG AGCTTCACTC ATCGCCCAAG GAGCGAGGGC GGCCATTTGT GGCCGCTACC 9240


TCTTCAACTG GGCGGTGAAA ACAAAGCTCA AACTCACTCC ATTGCCCGAG GCGAGCCGCC 9300

TGGATTTATC CGGGTGGTTC ACCGTGGGCG CCGGCGGGGG CGACATTTAT CACAGCGTGT 9360

CGCATGCYCG ACCCCGCCTA TTACTCCTTT GCCTACTCCT ACTTAGCGTA GGAGTAGGCA 9420

TCTTTTTACT CCCCGCTCGG TAGAGCGGCA AACYCTAGCT ACACTCCATA GCTAGTTTCC 9480

GTTTTTTTTT TTTTTTTTTT TTTTTTTTTT T 9511
Sequence ID No. 3
Sequence Length: 3,033 
Sequence Type: amino acid 
Topology: linear 
Molecule Type: protein 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr

5 10 15

Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile

20 25 30

Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly

35 40 45

Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly

50 55 60

Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser

65 70 75 

Trp Gly Lys Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly 

80 85 90 

Cys Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro 

95 100 105 

Thr Trp Gly Pro Thr Asp Pro Arg His Arg Ser Arg Asn Leu Gly 

110 115 120 

Arg Val Ile Asp Thr Ile Thr Cys Gly Phe Ala Asp Leu Met Gly

125 130 135

Tyr Ile Pro Val Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala

140 145 150

Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala

155 160 165 
Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala

170 175 180

Leu Leu Ser Cys Val Thr Val Pro Val Ser Ala Val Glu Val Arg

185 190 195

Asn Ile Ser Ser Ser Tyr Tyr Ala Thr Asn Asp Cys Ser Asn Asn

200 205 210

Ser Ile Thr Trp Gln Leu Thr Asp Ala Val Leu His Leu Pro Gly

215 220 225

Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu His Cys Trp Ile

230 235 240

Gln Val Thr Pro Asn Val Ala Val Lys His Arg Gly Ala Leu Thr

245 250 255

Arg Ser Leu Arg Thr His Val Asp Met Ile Val Met Ala Ala Thr

260 265 270

Ala Cys Ser Ala Leu Tyr Val Gly Asp Val Cys Gly Ala Val Met

275 280 285

Ile Leu Ser Gln Ala Phe Met Val Ser Pro Gln Arg His Asn Phe

290 295 300

Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly His Ile Thr Gly

305 310 315

His Arg Met Ala Trp Asp Met Met Leu Ser Trp Ser Pro Thr Leu

320 325 330

Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro Glu Leu Val Leu

335 340 345

Glu Ile Ile Phe Gly Gly His Trp Gly Val Val Phe Gly Leu Ala

350 355 360

Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val Ile Ala Ile Leu

365 370 375

Leu Leu Val Ala Gly Val Asp Ala Thr Thr Tyr Ser Ser Gly Gln

380 385 390

Glu Ala Gly Arg Thr Val Ala Gly Phe Ala Gly Leu Phe Thr Thr

395 400 405

Gly Ala Lys Gln Asn Leu Tyr Leu Ile Asn Thr Asn Gly Ser Trp

410 415 420

His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Gln Thr

425 430 435

Gly Phe Leu Ala Ser Leu Phe Tyr Thr His Lys Phe Asn Ser Ser

440 445 450

Gly Cys Pro Glu Arg Leu Ser Ser Cys Arg Gly Leu Asp Asp Phe

455 460 465

Arg Ile Gly Trp Gly Thr Leu Glu Tyr Glu Thr Asn Val Thr Asn

470 475 480

Asp Gly Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Pro

485 490 495

Cys Gly Ile Val Pro Ala Arg Thr Val Cys Gly Pro Val Tyr Cys

500 505 510

Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Lys Gln Gly

515 520 525

Val Pro Thr Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu

530 535 540

Leu Asn Ser Thr Arg Pro Pro Arg Gly Ala Trp Phe Gly Cys Thr

545 550 555

Trp Met Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly Ala Pro Pro

560 565 570

Cys Arg Ile Arg Lys Asp Tyr Asn Ser Thr Ile Asp Leu Leu Cys

575 580 585

Pro Thr Asp Cys Phe Arg Lys His Pro Asp Ala Thr Tyr Leu Lys

590 595 600

Cys Gly Ala Gly Pro Trp Leu Thr Pro Arg Cys Leu Val Asp Tyr

605 610 615

Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe Thr Ile

620 625 630 

Phe Lys Ala Arg Met Tyr Val Gly Gly Val Glu His Arg Phe Ser 

635 640 645 

Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys Arg Leu Glu Asp 

650 655 660 

Arg Asp Arg Gly Gln Gln Ser Pro Leu Leu His Ser Thr Thr Glu

665 670 675 

Trp Ala Val Leu Pro Cys Ser Phe Ser Asp Leu Pro Ala Leu Ser 

680 685 690 

Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln Tyr

695 700 705

Leu Tyr Gly Leu Ser Pro Ala Leu Thr Arg Tyr Ile Val Lys Trp

710 715 720

Glu Trp Val Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Ile

725 730 735

Cys Ala Cys Leu Trp Met Leu Ile Ile Leu Gly Gln Ala Glu Ala

740 745 750

Ala Leu Glu Lys Leu Ile Ile Leu His Ser Ala Ser Ala Ala Ser

755 760 765

Ala Asn Gly Pro Leu Trp Phe Phe Ile Phe Phe Thr Ala Ala Trp

770 775 780 

Tyr Leu Lys Gly Arg Val Val Pro Val Ala Thr Tyr Ser Val Leu 

785 790 795 

Gly Leu Trp Ser Phe Leu Leu Leu Val Leu Ala Leu Pro Gln Gln

800 805 810

Ala Tyr Ala Leu Asp Ala Ala Glu Gln Gly Glu Leu Gly Leu Ala




EP 0 532 167 A2 




815 820 825 

Ile Leu Val Ile Ile Ser Ile Phe Thr Leu Thr Pro Ala Tyr Lys

830 835 840

Ile Leu Leu Ser Arg Ser Val Trp Trp Leu Ser Tyr Met Leu Val

845 850 855

Leu Ala Glu Ala Gln Ile Gln Gln Trp Val Pro Pro Leu Glu Val

860 865 870

Arg Gly Gly Arg Asp Gly Ile Ile Trp Val Ala Val Ile Leu His

875 880 885

Pro Arg Leu Val Phe Glu Val Thr Lys Trp Leu Leu Ala Ile Leu

890 895 900

Gly Pro Ala Tyr Leu Leu Lys Ala Ser Leu Leu Arg Ile Pro Tyr

905 910 915 

Phe Val Arg Ala His Ala Leu Leu Arg Val Cys Thr Leu Val Lys 

920 925 930 

His Leu Ala Gly Ala Arg Tyr Ile Gln Met Leu Leu Ile Thr Ile

935 940 945

Gly Arg Trp Thr Gly Thr Tyr Ile Tyr Asp His Leu Ser Pro Leu

950 955 960

Ser Thr Trp Ala Ala Gln Gly Leu Arg Asp Leu Ala Ile Ala Val

965 970 975

Glu Pro Val Val Phe Ser Pro Met Glu Lys Lys Val Ile Val Trp

980 985 990

Gly Ala Glu Thr Val Ala Cys Gly Asp Ile Leu His Gly Leu Pro

995 1000 1005 

Val Ser Ala Arg Leu Gly Arg Glu Val Leu Leu Gly Pro Ala Asp 

1010 1015 1020 

Gly Tyr Thr Ser Lys Gly Trp Lys Leu Leu Ala Pro Ile Thr Ala

1025 1030 1035 
Tyr Thr Gln Gln Thr Arg Gly Leu Leu Gly Ala Ile Val Val Ser

1040 1045 1050

Leu Thr Gly Arg Asp Lys Asn Glu Gln Ala Gly Gln Val Gln Val

1055 1060 1065

Leu Ser Ser Val Thr Gln Thr Phe Leu Gly Thr Ser Ile Ser Gly

1070 1075 1080 

Val Leu Trp Thr Val Tyr His Gly Ala Gly Asn Lys Thr Leu Ala 

1085 1090 1095 

Gly Pro Lys Gly Pro Val Thr Gln Met Tyr Thr Ser Ala Glu Gly

1100 1105 1110 

Asp Leu Val Gly Trp Pro Ser Pro Pro Gly Thr Lys Ser Leu Asp 

1115 1120 1125 

Pro Cys Thr Cys Gly Ala Val Asp Leu Tyr Leu Val Thr Arg Asn 

1130 1135 1140 

Ala Asp Val Ile Pro Val Arg Arg Lys Asp Asp Arg Arg Gly Ala

1145 1150 1155 

Leu Leu Ser Pro Arg Pro Leu Ser Thr Leu Lys Gly Ser Ser Gly 

1160 1165 1170 

Gly Pro Val Leu Cys Ser Arg Gly His Ala Val Gly Leu Phe Arg 

1175 1180 1185 

Ala Ala Val Cys Ala Arg Gly Val Ala Lys Ser Ile Asp Phe Ile

1190 1195 1200 

Pro Val Glu Ser Leu Asp Val Ala Thr Arg Thr Pro Ser Phe Ser 

1205 1210 1215 

Asp Asn Ser Thr Pro Pro Ala Val Pro Gln Ser Tyr Gln Val Gly

1220 1225 1230 

Tyr Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro 

1235 1240 1245 

Ala Ala Tyr Ala Ser Gln Gly Tyr Lys Val Leu Val Leu Asn Pro

1250 1255 1260

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala 

1265 1270 1275 

His Gly Ile Asn Pro Asn Ile Arg Thr Gly Val Arg Thr Val Thr

1280 1285 1290

Thr Gly Asp Ser Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Ile Ala

1295 1300 1305

Asp Gly Gly Cys Ala Ala Gly Ala Tyr Asp Ile Ile Ile Cys Asp

1310 1315 1320

Glu Cys His Ser Val Asp Ala Thr Thr Ile Leu Gly Ile Gly Thr

1325 1330 1335

Val Leu Asp Gln Ala Glu Thr Ala Gly Val Arg Leu Val Val Leu

1340 1345 1350 

Ala Thr Ala Thr Pro Pro Gly Thr Val Thr Thr Pro His Ser Asn 

1355 1360 1365 

Ile Glu Glu Val Ala Leu Gly His Glu Gly Glu Ile Pro Phe Tyr

1370 1375 1380

Gly Lys Ala Ile Pro Leu Ala Phe Ile Lys Gly Gly Arg His Leu

1385 1390 1395

Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Ala

1400 1405 1410

Leu Arg Gly Met Gly Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu

1415 1420 1425

Asp Val Ser Val Ile Pro Thr Gln Gly Asp Val Val Val Val Ala

1430 1435 1440 

Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val 

1445 1450 1455 

Ile Asp Cys Asn Val Ala Val Ser Gln Ile Val Asp Phe Ser Leu

1460 1465 1470 
Asp Pro Thr Phe Thr Ile Thr Thr Gln Thr Val Pro Gln Asp Ala

1475 1480 1485

Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Leu

1490 1495 1500

Gly Val Tyr Arg Tyr Val Ser Ser Gly Glu Arg Pro Ser Gly Met

1505 1510 1515 

Phe Asp Ser Val Val Leu Cys Glu Cys Tyr Asp Ala Gly Ala Ala 

1520 1525 1530 

Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala 

1535 1540 1545 

Tyr Phe Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu

1550 1555 1560

Phe Trp Glu Ala Val Phe Thr Gly Leu Thr His Ile Asp Ala His

1565 1570 1575

Phe Leu Ser Gln Thr Lys Gln Gly Gly Glu Asn Phe Ala Tyr Leu

1580 1585 1590

Thr Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala Pro Pro

1595 1600 1605

Pro Ser Trp Asp Val Met Trp Lys Cys Leu Thr Arg Leu Lys Pro

1610 1615 1620 

Thr Leu Thr Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val 

1625 1630 1635 

Thr Asn Glu Val Thr Leu Thr His Pro Val Thr Lys Tyr Ile Ala

1640 1645 1650

Thr Cys Met Gln Ala Asp Leu Glu Ile Met Thr Ser Ser Trp Val

1655 1660 1665 

Leu Ala Gly Gly Val Leu Ala Ala Val Ala Ala Tyr Cys Leu Ala 

1670 1675 1680 

Thr Gly Cys Ile Ser Ile Ile Gly Arg Leu His Leu Asn Asp Arg

1685 1690 1695

Val Val Val Ala Pro Asp Lys Glu Ile Leu Tyr Glu Ala Phe Asp

1700 1705 1710

Glu Met Glu Glu Cys Ala Ser Lys Ala Ala Leu Ile Glu Glu Gly

1715 1720 1725

Gln Arg Met Ala Glu Met Leu Lys Ser Lys Ile Gln Gly Leu Leu

1730 1735 1740

Gln Gln Ala Thr Arg Gln Ala Gln Asp Ile Gln Pro Ala Ile Gln

1745 1750 1755

Ser Ser Trp Pro Lys Leu Glu Gln Phe Trp Ala Lys His Met Trp

1760 1765 1770

Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu

1775 1780 1785

Pro Gly Asn Pro Ala Val Ala Ser Met Met Ala Phe Ser Ala Ala

1790 1795 1800

Leu Thr Ser Pro Leu Pro Thr Ser Thr Thr Ile Leu Leu Asn Ile

1805 1810 1815

Met Gly Gly Trp Leu Ala Ser Gln Ile Ala Pro Pro Ala Gly Ala

1820 1825 1830 

Thr Gly Phe Val Val Ser Gly Leu Val Gly Ala Ala Val Gly Ser 

1835 1840 1845 

Ile Gly Leu Gly Lys Ile Leu Val Asp Val Leu Ala Gly Tyr Gly

1850 1855 1860

Ala Gly Ile Ser Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly

1865 1870 1875

Glu Lys Pro Thr Val Glu Asp Val Val Asn Leu Leu Pro Ala Ile

1880 1885 1890

Leu Ser Pro Gly Ala Leu Val Val Gly Val Ile Cys Ala Ala Ile

1895 1900 1905 
Leu Arg Arg His Val Gly Gln Gly Glu Gly Ala Val Gln Trp Met

1910 1915 1920

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ala Pro

1925 1930 1935

Thr His Tyr Val Val Glu Ser Asp Ala Ser Gln Arg Val Thr Gln

1940 1945 1950

Val Leu Ser Ser Leu Thr Ile Thr Ser Leu Leu Arg Arg Leu His

1955 1960 1965

Ala Trp Ile Thr Glu Asp Cys Pro Val Pro Cys Ser Gly Ser Trp

1970 1975 1980

Leu Gln Asp Ile Trp Asp Trp Val Cys Ser Ile Leu Thr Asp Phe

1985 1990 1995

Lys Asn Trp Leu Ser Ser Lys Leu Leu Pro Lys Met Pro Gly Ile

2000 2005 2010

Pro Phe Ile Ser Cys Gln Lys Gly Tyr Lys Gly Val Trp Ala Gly

2015 2020 2025

Thr Gly Val Met Thr Thr Arg Cys Pro Cys Gly Ala Asn Ile Ser

2030 2035 2040

Gly His Val Arg Met Gly Thr Met Lys Ile Thr Gly Pro Lys Thr

2045 2050 2055

Cys Leu Asn Leu Trp Gln Gly Thr Phe Pro Ile Asn Cys Tyr Thr

2060 2065 2070

Glu Gly Pro Cys Val Pro Lys Pro Pro Pro Asn Tyr Lys Thr Ala

2075 2080 2085

Ile Trp Arg Val Ala Ala Ser Glu Tyr Val Glu Val Thr Gln His

2090 2095 2100

Gly Ser Phe Ser Tyr Val Thr Gly Leu Thr Ser Asp Asn Leu Lys

2105 2110 2115

Val Pro Cys Gln Val Pro Ala Pro Glu Phe Phe Ser Trp Val Asp




2120 2125 2130 

Gly Val Gln Ile His Arg Phe Ala Pro Val Pro Gly Pro Phe Phe

2135 2140 2145 

Arg Asp Glu Val Thr Phe Thr Val Gly Leu Asn Ser Phe Val Val 

2150 2155 2160 

Gly Ser Gln Leu Pro Cys Asp Pro Glu Pro Asp Thr Glu Val Leu

2165 2170 2175

Ala Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala

2180 2185 2190

Ala Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Gln Ala Ser Ser

2195 2200 2205

Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr

2210 2215 2220 

Thr His Lys Thr Ala Tyr Asp Cys Asp Met Val Asp Ala Asn Leu 

2225 2230 2235 

Phe Met Gly Gly Asp Val Thr Arg Ile Glu Ser Asp Ser Lys Val

2240 2245 2250

Ile Val Leu Asp Ser Leu Asp Ser Met Thr Glu Val Glu Asp Asp

2255 2260 2265

Arg Glu Pro Ser Val Pro Ser Glu Tyr Leu Ile Lys Arg Arg Lys

2270 2275 2280 

Phe Pro Pro Ala Leu Pro Pro Trp Ala Arg Pro Asp Tyr Asn Pro 

2285 2290 2295 

Val Leu Ile Glu Thr Trp Lys Arg Pro Gly Tyr Glu Pro Pro Thr

2300 2305 2310

Val Leu Gly Cys Ala Leu Pro Pro Thr Pro Gln Thr Pro Val Pro

2315 2320 2325

Pro Pro Arg Arg Arg Arg Ala Lys Val Leu Thr Gln Asp Asn Val

2330 2335 2340 
Glu Gly Val Leu Arg Glu Met Ala Asp Lys Val Leu Ser Pro Leu

2345 2350 2355 

Gln Asp Asn Asn Asp Ser Gly His Ser Thr Gly Ala Asp Thr Gly

2360 2365 2370 

Gly Asp Ile Val Gln Gln Pro Ser Asp Glu Thr Ala Ala Ser Glu

2375 2380 2385 

Ala Gly Ser Leu Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly 

2390 2395 2400 

Asp Pro Asp Leu Glu Phe Glu Pro Val Gly Ser Ala Pro Pro Ser 

2405 2410 2415 

Glu Gly Glu Cys Glu Val Ile Asp Ser Asp Ser Lys Ser Trp Ser

2420 2425 2430

Thr Val Ser Asp Gln Glu Asp Ser Val Ile Cys Cys Ser Met Ser

2435 2440 2445

Tyr Ser Trp Thr Gly Ala Leu Ile Thr Pro Cys Gly Pro Glu Glu

2450 2455 2460

Glu Lys Leu Pro Ile Asn Pro Leu Ser Asn Ser Leu Met Arg Phe

2465 2470 2475 

His Asn Lys Val Tyr Ser Thr Thr Ser Arg Ser Ala Ser Leu Arg 

2480 2485 2490 

Ala Lys Lys Val Thr Phe Asp Arg Val Gln Val Leu Asp Ala His

2495 2500 2505

Tyr Asp Ser Val Leu Gln Asp Val Lys Arg Ala Ala Ser Lys Val

2510 2515 2520 

Ser Ala Arg Leu Leu Thr Val Glu Glu Ala Cys Ala Leu Thr Pro 

2525 2530 2535 

Pro His Ser Ala Lys Ser Arg Tyr Gly Phe Gly Ala Lys Glu Val 

2540 2545 2550 

Arg Ser Leu Ser Arg Arg Ala Val Asn His Ile Arg Ser Val Trp

2555 2560 2565

Glu Asn Leu Leu Glu Asp Gln His Thr Pro Ile Asp Thr Thr Ile

2570 2575 2580

Met Ala Lys Asn Glu Val Phe Cys Ile Asp Pro Thr Lys Gly Gly

2585 2590 2595

Lys Lys Pro Ala Arg Leu Ile Val Tyr Pro Asp Leu Gly Val Arg

2600 2605 2610

Val Cys Glu Lys Met Ala Leu Tyr Asp Ile Ala Gln Lys Leu Pro

2615 2620 2625

Lys Ala Ile Met Gly Pro Ser Tyr Gly Phe Gln Tyr Ser Pro Ala

2630 2635 2640 

Glu Arg Val Asp Phe Leu Leu Lys Ala Trp Gly Ser Lys Lys Asp 

2645 2650 2655 

Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val

2660 2665 2670

Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Gln Ala Cys

2675 2680 2685

Ser Leu Pro Gln Glu Ala Arg Thr Val Ile His Ser Leu Thr Glu

2690 2695 2700

Arg Leu Tyr Val Gly Gly Pro Met Thr Asn Ser Lys Gly Gln Ser

2705 2710 2715 

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser 

2720 2725 2730 

Met Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Leu Ala Ala Cys

2735 2740 2745

Lys Ala Ala Gly Ile Val Asp Pro Val Met Leu Val Cys Gly Asp

2750 2755 2760

Asp Leu Val Val Ile Ser Glu Ser Gln Gly Asn Glu Glu Asp Glu

2765 2770 2775 
Arg Asn Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala

2780 2785 2790

Pro Pro Gly Asp Leu Pro Arg Pro Glu Tyr Asp Leu Glu Leu Ile

2795 2800 2805 

Thr Ser Cys Ser Ser Asn Val Ser Val Ala Leu Asp Ser Arg Gly 

2810 2815 2820 

Arg Arg Arg Tyr Phe Leu Thr Arg Asp Pro Thr Thr Pro Ile Thr

2825 2830 2835 

Arg Ala Ala Trp Glu Thr Val Arg His Ser Pro Val Asn Ser Trp 

2840 2845 2850 

Leu Gly Asn Ile Ile Gln Tyr Ala Pro Thr Ile Trp Val Arg Met

2855 2860 2865

Val Ile Met Thr His Phe Phe Ser Ile Leu Leu Ala Gln Asp Thr

2870 2875 2880

Leu Asn Gln Asn Leu Asn Phe Glu Met Tyr Gly Ala Val Tyr Ser

2885 2890 2895

Val Asn Pro Leu Asp Leu Pro Ala Ile Ile Glu Arg Leu His Gly

2900 2905 2910 

Leu Glu Ala Phe Ser Leu His Thr Tyr Ser Pro His Glu Leu Ser 

2915 2920 2925 

Arg Val Ala Ala Thr Leu Arg Lys Leu Gly Ala Pro Pro Leu Arg 

2930 2935 2940 

Ala Trp Lys Ser Arg Ala Arg Ala Val Arg Ala Ser Leu Ile Ala

2945 2950 2955

Gln Gly Ala Arg Ala Ala Ile Cys Gly Arg Tyr Leu Phe Asn Trp

2960 2965 2970 

Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Leu Pro Glu Ala Ser 

2975 2980 2985 

Arg Leu Asp Leu Ser Gly Trp Phe Thr Val Gly Ala Gly Gly Gly 
2990 2995 3000

Asp Ile Tyr His Ser Val Ser His Ala Arg Pro Arg Leu Leu Leu

3005 3010 3015

Leu Cys Leu Leu Leu Leu Ser Val Gly Val Gly Ile Phe Leu Leu

3020 3025 3030 

Pro Ala Arg 
3033 
Sequence ID No. 4
Sequence Length: 3,033 
Sequence Type: amino acid 
Topology: linear 
Molecule Type: protein 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr

5 10 15

Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile

20 25 30

Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly

35 40 45

Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly

50 55 60

Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser

65 70 75

Trp Gly Lys Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly

80 85 90

Cys Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro

95 100 105

Thr Trp Gly Pro Thr Asp Pro Arg His Arg Ser Arg Asn Leu Gly

110 115 120

Arg Val Ile Asp Thr Ile Thr Cys Gly Phe Ala Asp Leu Met Gly

125 130 135

Tyr Ile Pro Val Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala

140 145 150

Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala

155 160 165






Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala

170 175 180 

Leu Leu Ser Cys Val Thr Met Pro Val Ser Ala Val Glu Val Arg 

185 190 195 

Asn Ile Ser Ser Ser Tyr Tyr Ala Thr Asn Asp Cys Ser Asn Asn

200 205 210

Ser Ile Thr Trp Gln Leu Thr Asp Ala Val Leu His Leu Pro Gly

215 220 225

Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu Arg Cys Trp Ile

230 235 240

Gln Val Thr Pro Asp Val Ala Val Lys His Arg Gly Ala Leu Thr

245 250 255

Arg Ser Leu Arg Thr His Val Asp Met Ile Val Met Ala Ala Thr

260 265 270

Ala Cys Ser Ala Leu Tyr Val Gly Asp Val Cys Gly Ala Val Met

275 280 285

Ile Leu Ser Gln Ala Phe Met Val Ser Pro Gln Arg His Asn Phe

290 295 300

Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly His Ile Thr Gly

305 310 315

His Arg Met Ala Trp Asp Met Met Leu Asn Trp Ser Pro Thr Leu

320 325 330

Ala Met Ile Leu Ala Tyr Ala Ala Arg Val Pro Glu Leu Val Leu

335 340 345

Glu Ile Ile Phe Gly Gly His Trp Gly Val Ala Phe Gly Leu Gly

350 355 360

Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val Val Ala Ile Leu

365 370 375

Leu Leu Val Ala Gly Val Asp Ala Ser Thr Tyr Ser Thr Gly Gln

380 385 390

Gln Ala Gly Arg Ala Ala Tyr Gly Ile Ser Ser Leu Phe Asn Thr

395 400 405

Gly Ala Lys Gln Asn Leu His Leu Ile Asn Thr Asn Gly Ser Trp

410 415 420

His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Glu Thr

425 430 435

Gly Phe Ile Ala Ser Leu Val Tyr Tyr Arg Arg Phe Asn Ser Ser

440 445 450

Gly Cys Pro Glu Arg Leu Ser Ser Cys Arg Gly Leu Asp Asp Phe

455 460 465

Arg Ile Gly Trp Gly Thr Leu Glu Tyr Glu Thr Asn Val Thr Asn

470 475 480

Asp Glu Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Pro

485 490 495

Cys Gly Ile Val Pro Ala Arg Thr Val Cys Gly Pro Val Tyr Cys

500 505 510

Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Lys Gln Gly

515 520 525

Val Pro Thr Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu

530 535 540

Leu Asn Ser Thr Arg Pro Pro Arg Gly Ala Trp Phe Gly Cys Thr

545 550 555

Trp Met Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly Ala Pro Pro

560 565 570

Cys Arg Ile Arg Lys Asp Tyr Asn Ser Thr Ile Asp Leu Leu Cys

575 580 585

Pro Thr Asp Cys Phe Arg Lys His Pro Asp Ala Thr Tyr Leu Lys

590 595 600

Cys Gly Ala Gly Pro Trp Leu Thr Pro Arg Cys Leu Val Asp Tyr

605 610 615

Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe Thr Ile

620 625 630 

Phe Lys Ala Arg Met Tyr Val Gly Gly Val Glu His Arg Phe Ser 

635 640 645 

Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys Arg Leu Glu Asp 

650 655 660 

Arg Asp Arg Gly Gln Gln Ser Pro Leu Leu His Ser Thr Thr Glu

665 670 675 

Trp Ala Val Phe Pro Cys Ser Phe Ser Asp Leu Pro Ala Leu Ser 

680 685 690 

Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln Tyr

695 700 705

Leu Tyr Gly Leu Ser Pro Ala Leu Thr Arg Tyr Ile Val Lys Trp

710 715 720

Glu Trp Val Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val

725 730 735

Cys Ala Cys Leu Trp Met Leu Asn Ile Leu Gly Gln Ala Glu Ala

740 745 750

Ala Leu Glu Lys Leu Ile Ile Leu His Ser Ala Ser Ala Ala Ser

755 760 765

Ala Asn Gly Pro Leu Trp Phe Phe Ile Phe Phe Thr Ala Ala Trp

770 775 780 

Tyr Leu Lys Gly Arg Val Val Pro Val Ala Thr Tyr Ser Val Leu 

785 790 795 

Gly Leu Trp Ser Phe Leu Leu Leu Val Leu Ala Leu Pro Gln Gln

800 805 810

Ala Tyr Ala Leu Asp Ala Ala Glu Gln Gly Glu Leu Gly Leu Ala

815 820 825

Ile Leu Val Ile Ile Ser Ile Phe Thr Leu Thr Pro Ala Tyr Lys

830 835 840

Ile Leu Leu Ser Arg Ser Val Trp Trp Leu Ser Tyr Met Leu Val

845 850 855

Leu Ala Glu Ala Gln Ile Gln Gln Trp Val Pro Pro Leu Glu Val

860 865 870

Arg Gly Gly Arg Asp Gly Ile Ile Trp Val Ala Val Ile Leu His

875 880 885

Pro Arg Leu Val Phe Glu Val Thr Lys Trp Leu Leu Ala Ile Leu

890 895 900

Gly Pro Ala Tyr Leu Leu Arg Ala Ser Leu Leu Arg Ile Pro Tyr

905 910 915 

Phe Val Arg Ala His Ala Leu Leu Arg Val Cys Thr Leu Val Lys 

920 925 930 

His Leu Ala Gly Ala Arg Tyr Ile Gln Met Leu Leu Ile Thr Ile

935 940 945

Gly Arg Trp Thr Gly Thr Tyr Ile Tyr Asp His Leu Ser Pro Leu

950 955 960

Ser Thr Trp Ala Ala Gln Gly Leu Arg Asp Leu Ala Ile Ala Val

965 970 975

Glu Pro Val Val Phe Ser Pro Met Glu Lys Lys Val Ile Val Trp

980 985 990

Gly Ala Glu Thr Val Ala Cys Gly Asp Ile Leu His Gly Leu Pro

995 1000 1005 

Val Ser Ala Arg Leu Gly Arg Glu Val Leu Leu Gly Pro Ala Asp 

1010 1015 1020 

Gly Tyr Thr Ser Lys Gly Trp Asn Leu Leu Ala Pro Ile Thr Ala

1025 1030 1035 
Tyr Thr Gln Gln Thr Arg Gly Leu Leu Gly Ala Ile Val Val Ser

1040 1045 1050

Leu Thr Gly Arg Asp Lys Asn Glu Gln Ala Gly Gln Val Gln Val

1055 1060 1065

Leu Ser Ser Val Thr Gln Thr Phe Leu Gly Thr Ser Ile Ser Gly

1070 1075 1080 

Val Leu Trp Thr Val Tyr His Gly Ala Gly Asn Lys Thr Leu Ala 

1085 1090 1095 

Gly Pro Lys Gly Pro Val Thr Gln Met Tyr Thr Ser Ala Glu Gly

1100 1105 1110 

Asp Leu Val Gly Trp Pro Ser Pro Pro Gly Thr Lys Ser Leu Asp 

1115 1120 1125 

Pro Cys Thr Cys Gly Ala Val Asp Leu Tyr Leu Val Thr Arg Asn 

1130 1135 1140 

Ala Asp Val Ile Pro Val Arg Arg Lys Asp Asp Arg Arg Gly Ala

1145 1150 1155 

Leu Leu Ser Pro Arg Pro Leu Ser Thr Leu Lys Gly Ser Ser Gly 

1160 1165 1170 

Gly Pro Val Leu Cys Ser Arg Gly His Ala Val Gly Leu Phe Arg 

1175 1180 1185 

Ala Ala Val Cys Ala Arg Gly Val Ala Lys Ser Ile Asp Phe Ile

1190 1195 1200

Pro Val Glu Ser Leu Asp Ile Ala Thr Arg Thr Pro Ser Phe Ser

1205 1210 1215

Asp Asn Ser Ala Pro Pro Ala Val Pro Gln Ser Tyr Gln Val Gly

1220 1225 1230 

Tyr Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro 

1235 1240 1245 

Ala Ala Tyr Ala Ser Gln Gly Tyr Lys Val Leu Val Leu Asn Pro

1250 1255 1260

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala

1265 1270 1275

His Gly Ile Asn Pro Asn Ile Arg Thr Gly Val Arg Thr Val Thr

1280 1285 1290

Thr Gly Asp Ser Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Ile Ala

1295 1300 1305

Asp Gly Gly Cys Ala Ala Gly Ala Tyr Asp Ile Ile Ile Cys Asp

1310 1315 1320

Glu Cys His Ser Val Asp Ala Thr Thr Ile Leu Gly Ile Gly Thr

1325 1330 1335

Val Leu Asp Gln Ala Glu Thr Ala Gly Val Arg Leu Val Val Leu

1340 1345 1350 

Ala Thr Ala Thr Pro Pro Gly Thr Val Thr Thr Pro His Ser Asn 

1355 1360 1365 

Ile Glu Glu Val Ala Leu Gly His Glu Gly Glu Ile Pro Phe Tyr

1370 1375 1380

Gly Lys Ala Ile Pro Leu Ala Phe Ile Lys Gly Gly Arg His Leu

1385 1390 1395

Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Ala

1400 1405 1410 

Leu Arg Gly Thr Gly Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu 

1415 1420 1425 

Asp Val Ser Val Ile Pro Thr Gln Gly Asp Val Val Val Val Ala

1430 1435 1440 

Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val 

1445 1450 1455 

Ile Asp Cys Asn Val Ala Val Ser Gln Ile Val Asp Phe Ser Leu

1460 1465 1470 
Asp Pro Thr Phe Thr Ile Thr Thr Gln Thr Val Pro Gln Asp Ala

1475 1480 1485

Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Leu

1490 1495 1500

Gly Ile Tyr Arg Tyr Val Ser Ser Gly Glu Gly Pro Ser Gly Met

1505 1510 1515 

Phe Asp Ser Val Val Pro Cys Glu Cys Tyr Asp Ala Gly Ala Ala 

1520 1525 1530 

Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala 

1535 1540 1545 

Tyr Phe Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu

1550 1555 1560

Phe Trp Glu Ala Val Phe Thr Gly Leu Thr His Ile Asn Ala His

1565 1570 1575

Phe Leu Ser Gln Thr Lys Gln Gly Gly Glu Asn Phe Ala Tyr Leu

1580 1585 1590

Thr Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala Pro Pro

1595 1600 1605 

Pro Ser Trp Asp Val Met Trp Lys Cys Leu Thr Arg Leu Lys Pro 

1610 1615 1620 

Thr Leu Thr Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val 

1625 1630 1635 

Thr Asn Glu Val Thr Leu Thr His Pro Val Thr Lys Tyr Ile Ala

1640 1645 1650

Thr Cys Met Gln Ala Asp Leu Glu Ile Met Thr Ser Ser Trp Val

1655 1660 1665 

Leu Ala Gly Gly Val Leu Ala Ala Val Ala Ala Tyr Cys Leu Ala 

1670 1675 1680 

Thr Gly Cys Ile Ser Ile Ile Gly Arg Leu His Leu Asn Asp Arg

1685 1690 1695

Val Val Val Thr Pro Asp Lys Glu Ile Leu Tyr Glu Ala Phe Asp

1700 1705 1710

Glu Met Glu Glu Cys Ala Ser Lys Ala Ala Leu Ile Glu Glu Gly

1715 1720 1725

Gln Arg Met Ala Glu Met Leu Lys Ser Lys Ile Gln Gly Leu Leu

1730 1735 1740

Gln Gln Ala Thr Arg Gln Ala Gln Gly Met Gln Pro Ala Ile Gln

1745 1750 1755

Ser Ser Trp Pro Lys Leu Glu Gln Phe Trp Ala Lys His Met Trp

1760 1765 1770

Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu

1775 1780 1785

Pro Gly Asn Pro Ala Val Ala Ser Met Met Ala Phe Ser Ala Ala

1790 1795 1800

Leu Thr Ser Pro Leu Pro Thr Ser Thr Thr Ile Leu Leu Asn Ile

1805 1810 1815

Met Gly Gly Trp Leu Ala Ser Gln Ile Ala Pro Pro Ala Gly Ala

1820 1825 1830 

Thr Gly Phe Val Val Ser Gly Leu Val Gly Ala Ala Val Gly Ser 

1835 1840 1845 

Ile Gly Leu Gly Lys Ile Leu Val Asp Val Leu Ala Gly Tyr Gly

1850 1855 1860

Ala Gly Ile Ser Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly

1865 1870 1875

Glu Lys Pro Thr Val Glu Asp Val Val Asn Leu Leu Pro Ala Ile

1880 1885 1890

Leu Ser Pro Gly Ala Leu Val Val Gly Val Ile Cys Ala Ala Ile

1895 1900 1905 
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Leu Arg Arg His Val Gly Gin Gly Glu Gly Ala Val Gin Trp Met 

1910 1915 1920 

Ash Arg Leu lie Ala Phe Ala Ser Arg Gly Asn His Val Ala Pro 

1925 1930 1935 

Thr His Tyr Val Val Glu Ser Asp Ala Ser Gin Arg Val Thr Gin 

1940 1945 1950 

Val Leu Ser Ser Leu Thr Ile Thr Ser Leu Leu Arg Arg Leu His

1955 1960 1965 

Ala Trp Ile Thr Glu Asp Cys Pro Ile Pro Cys Ser Gly Ser Trp

1970 1975 1980 

Leu Gln Asp Ile Trp Asp Trp Val Cys Ser Ile Leu Thr Asp Phe

1985 1990 1995 

Lys Asn Trp Leu Ser Ser Lys Leu Leu Pro Lys Met Pro Gly Ile

2000 2005 2010 

Pro Phe Ile Ser Cys Gln Lys Gly Tyr Lys Gly Val Trp Ala Gly

2015 2020 2025 

Thr Gly Val Met Thr Thr Arg Tyr Pro Cys Gly Ala Asn Ile Ser

2030 2035 2040 

Gly His Val Arg Met Gly Thr Met Lys Ile Thr Gly Pro Lys Thr

2045 2050 2055 

Cys Leu Asn Leu Trp Gln Gly Thr Phe Pro Ile Asn Cys Tyr Thr

2060 2065 2070 

Glu Gly Pro Cys Val Pro Lys Pro Pro Pro Asn Tyr Lys Thr Ala

2075 2080 2085 

Ile Trp Arg Val Ala Ala Ser Glu Tyr Val Glu Val Thr Gln His

2090 2095 2100 

Gly Ser Phe Ser Tyr Val Thr Gly Leu Thr Ser Asp Asn Leu Lys 

2105 2110 2115 

Val Pro Cys Gln Val Pro Ala Pro Glu Phe Phe Ser Trp Val Asp

2120 2125 2130

Gly Val Gln Ile His Arg Phe Ala Pro Val Pro Gly Pro Phe Phe

2135 2140 2145 

Arg Asp Glu Val Thr Phe Thr Val Gly Leu Asn Ser Phe Val Val 

2150 2155 2160 

Gly Ser Gln Leu Pro Cys Asp Pro Glu Pro Asp Thr Glu Val Leu

2165 2170 2175 

Ala Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala

2180 2185 2190 

Ala Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Gln Ala Ser Ser

2195 2200 2205 

Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr

2210 2215 2220 

Thr His Lys Thr Ala Tyr Asp Cys Asp Met Val Asp Ala Asn Leu 

2225 2230 2235 

Phe Met Gly Gly Asp Val Thr Arg Ile Glu Ser Asp Ser Lys Val

2240 2245 2250 

Ile Val Leu Asp Ser Leu Asp Ser Met Thr Glu Val Glu Asp Asp

2255 2260 2265 

Arg Glu Pro Ser Val Pro Ser Glu Tyr Leu Ile Lys Arg Arg Lys

2270 2275 2280 

Phe Pro Pro Ala Leu Pro Pro Trp Ala Arg Pro Asp Tyr Asn Pro 

2285 2290 2295 

Val Leu Ile Glu Thr Trp Lys Arg Pro Gly Tyr Glu Pro Pro Thr

2300 2305 2310 

Val Leu Gly Cys Ala Leu Pro Pro Thr Leu Gln Thr Pro Val Pro

2315 2320 2325 

Pro Pro Arg Arg Arg Arg Ala Lys Ile Leu Thr Gln Asp Asp Val

2330 2335 2340 
Glu Gly Ile Leu Arg Glu Met Ala Asp Lys Val Leu Ser Pro Leu

2345 2350 2355 

Gln Asp Asn Asn Asp Ser Gly His Ser Thr Gly Ala Asp Thr Gly

2360 2365 2370 

Gly Asp Ile Val Gln Gln Pro Ser Asp Glu Thr Ala Ala Ser Glu

2375 2380 2385 

Ala Gly Ser Leu Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly

2390 2395 2400 

Asp Pro Asp Leu Glu Phe Glu Pro Val Gly Ser Ala Pro Pro Ser 

2405 2410 2415 

Glu Gly Glu Cys Glu Val Ile Asp Ser Asp Ser Lys Ser Trp Ser

2420 2425 2430 

Thr Val Ser Asp Gln Glu Asp Ser Val Ile Cys Cys Ser Met Ser

2435 2440 2445 

Tyr Ser Trp Thr Gly Ala Leu Ile Thr Pro Cys Gly Pro Glu Glu

2450 2455 2460 

Glu Lys Leu Pro Ile Asn Pro Leu Ser Asn Ser Leu Met Arg Phe

2465 2470 2475 

His Asn Lys Val Tyr Ser Thr Thr Ser Arg Ser Ala Ser Leu Arg 

2480 2485 2490 

Ala Lys Lys Val Thr Phe Asp Arg Val Gln Val Leu Asp Ala His

2495 2500 2505 

Tyr Asp Ser Val Leu Gln Asp Val Lys Arg Ala Ala Ser Lys Val

2510 2515 2520 

Gly Ala Arg Leu Leu Thr Val Glu Glu Ala Cys Ala Leu Thr Pro 

2525 2530 2535 

Pro His Ser Ala Lys Ser Arg Tyr Gly Phe Gly Ala Lys Glu Val 

2540 2545 2550 

Arg Ser Leu Ser Arg Arg Ala Val Asn His Ile Arg Ser Val Trp

2555 2560 2565

Glu Asn Leu Leu Glu Asp Gln Arg Thr Pro Ile Asp Thr Thr Ile

2570 2575 2580

Met Ala Lys Asn Glu Val Phe Cys Ile Asp Pro Thr Lys Gly Gly

2585 2590 2595 

Lys Lys Pro Ala Arg Leu Ile Val Tyr Pro Asp Leu Gly Val Arg

2600 2605 2610

Val Cys Glu Lys Met Ala Leu Tyr Asp Ile Thr Gln Lys Leu Pro

2615 2620 2625

Lys Ala Ile Met Gly Pro Ser Tyr Gly Phe Gln Tyr Ser Pro Ala

2630 2635 2640 

Glu Arg Val Asp Phe Leu Leu Lys Ala Trp Gly Ser Lys Lys Asp 

2645 2650 2655 

Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val

2660 2665 2670 

Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Gln Ala Cys

2675 2680 2685 

Ser Leu Pro Gln Glu Ala Arg Thr Val Ile His Ser Leu Thr Glu

2690 2695 2700 

Arg Leu Tyr Val Gly Gly Pro Met Thr Asn Ser Lys Gly Gln Ser

2705 2710 2715 

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser 

2720 2725 2730 

Met Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Leu Ala Ala Cys

2735 2740 2745 

Lys Ala Ala Gly Ile Val Asp Pro Val Met Leu Val Cys Gly Asp

2750 2755 2760 

Asp Leu Val Val Ile Ser Glu Ser Gln Gly Asn Glu Glu Asp Glu

2765 2770 2775 
Arg Asn Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala

2780 2785 2790 

Pro Pro Gly Asp Leu Pro Arg Pro Glu Tyr Asp Leu Glu Leu Ile

2795 2800 2805 

Thr Ser Cys Ser Ser Asn Val Ser Val Ala Leu Asp Ser Arg Gly 

2810 2815 2820 

Arg Arg Arg Tyr Phe Leu Thr Arg Asp Pro Thr Thr Pro Ile Thr

2825 2830 2835 

Arg Ala Ala Trp Glu Thr Val Arg His Ser Pro Val Asn Ser Trp 

2840 2845 2850 

Leu Gly Asn Ile Ile Gln Tyr Ala Pro Thr Ile Trp Val Arg Met

2855 2860 2865 

Val Ile Met Thr His Phe Phe Ser Ile Leu Leu Ala Gln Asp Thr

2870 2875 2880 

Leu Asn Gln Asn Leu Asn Phe Glu Met Tyr Gly Ala Val Tyr Ser

2885 2890 2895 

Val Asn Pro Leu Asp Leu Pro Ala Ile Ile Glu Arg Leu His Gly

2900 2905 2910 

Leu Glu Ala Phe Ser Leu His Thr Tyr Ser Pro His Glu Leu Ser 

2915 2920 2925 

Arg Val Ala Ala Thr Leu Arg Lys Leu Gly Ala Pro Pro Leu Arg 

2930 2935 2940 

Ala Trp Lys Ser Arg Ala Arg Ala Val Arg Ala Ser Leu Ile Ala

2945 2950 2955 

Gln Gly Ala Arg Ala Ala Ile Cys Gly Arg Tyr Leu Phe Asn Trp

2960 2965 2970 

Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Leu Pro Glu Ala Ser 

2975 2980 2985 

Arg Leu Asp Leu Ser Gly Trp Phe Thr Val Gly Ala Gly Gly Gly 